Home-brew tests need regulation


A US proposal to regulate medical diagnostics from individual labs reflects the tests’ growing
complexity. Such guidance should be welcomed, not resisted. 


On 18 April, the US Centers for Disease Control and Prevention
(CDC) published an alert. The agency had learned of a new
test being used to diagnose Lyme disease, a tick-borne bacterial
infection that can cause fatigue, joint pain and nervous-system
problems. The test, like many others for the disease, had not been
formally evaluated and approved by governmental regulators, and
agency scientists worried that the method would churn out too many
false positives. But because of a regulatory loophole, there was little
the CDC could do except ask consumers to avoid the tests and urge
people to seek out the few diagnostics that had been approved by the
US Food and Drug Administration (FDA).

The problem extends well beyond Lyme disease. Thousands of other
‘home-brew’ medical tests — those developed in individual laboratories
and used to guide the diagnosis and treatment of everything from
cancer to Candida — have largely escaped federal oversight.

That is now likely to change. On 31 July, the FDA unveiled its plans
to regulate the field. In doing so, the agency is risking the wrath of
industry and academic labs alike, which have argued that regulation
of home-brew tests will slow the development of diagnostics unnecessarily.
Yet expanded oversight is warranted, and researchers would do
well to learn from the FDA’s example. As medical diagnostics become
more elaborate and more important in health-care decisions, they
need to be treated with more gravity.

In 1976, the US Congress declared that most diagnostic tests could
be considered medical devices and therefore fell under the FDA’s
regulatory purview. But at the time, laboratory tests tended to be simple,
familiar assays performed using components that had been approved
for clinical use. Typically, physicians and pathologists — often at the
same institution that carried out the test — interpreted the results.
Given this relatively safe environment, the FDA exercised its discretion
and declared that it would not regulate home-brew tests. (The FDA
does, however, regulate commercial tests that are developed and then
sold as kits to be used in other labs.)


A COMPLEX BREW 

Today, the medical-diagnostics field is very different. Tests are used 
more frequently, and in higher-risk settings, to select therapies for 
critically ill patients. Although some familiar tests remain, home-brew 
tests are increasingly carried out using cutting-edge science and tech- 
nologies, and yield results so complex that they require proprietary 
algorithms to parse the data. Genome-wide surveys of gene-expression 
patterns and genomic abnormalities, for example, have emerged as 
attractive ways to select treatments for people with cancer. But they 
present challenges for standardization across labs. 

The business of laboratory testing has also changed, with many tests 
now provided by large companies that mass-market their products. 
The well-known test for cancer-associated mutations in the genes
BRCA1 and BRCA2, for example — provided by Myriad Genetics of
Salt Lake City, Utah — is a home-brew assay because its results are
not independently analysed outside the company. Although that test
has a substantial body of research backing its veracity, many other
tests do not. And whereas regulators inspect general techniques and
equipment at some of these labs, they generally do not ascertain the
validity of the particular tests the labs deploy.

The FDA announced its intentions to change this policy at least as
early as 2010. Opposition was swift and fierce, and came from both
industry and academia. The long delay in the release of the FDA's new
policy prompted rumours of political interference. In July, five US
senators wrote a letter to the Office of Management and Budget, which
has to review proposed regulations, to question the delay in releasing
the FDA's guidance.

But in another letter sent last month, a host of academic testing
labs decried efforts to regulate the field, saying that the tests should
be considered services rather than devices. It is easy to understand
some of their concerns. The FDA is famously overcommitted and
under-resourced, and adding to its remit raises fears that the agency
will be slow to issue approvals, becoming a roadblock to innovation
just as the technologies are beginning to build up speed.

Fortunately, the plans unveiled by the FDA may sidestep such
concerns. The regulations will be phased in gradually, to avoid abrupt
interruptions of important medical services. And the agency intends
to focus first on tests that bear the most risk for patients. Low-risk tests
and those for rare diseases are likely to be excluded from regulation.

Properly executed, the proposals could bring welcome scientific
rigour to a field that has become unruly. Some FDA staff say that they
have struggled to combat the outdated sense of complacency with
respect to medical tests — and not only in clinical pathology labs.
Researchers, too, have had to be persuaded that diagnostics deserve
heightened scrutiny. Too many scientists are still not aware that the
agency needs to review trials involving medical tests — for example,
clinical trials that select cancer therapies on the basis of mutations
found in a participant’s tumour. If such a trial is considered sufficiently
risky, the FDA may require further evidence that the test is valid.

Researchers sometimes chafe at these rules. But in 2010, Duke
University in Durham, North Carolina, ended three clinical trials
designed to determine whether gene-expression profiles could predict
patient responses to lung-cancer therapies. The trials were based on
results from cancer researcher Anil Potti, and were terminated well
after other scientists reported flaws in his analyses. Those flaws might
have been acknowledged earlier if the FDA had been consulted before
the trials started.

With its proposal to regulate home-brew tests, the FDA is responding
to a changing medical climate. Researchers must be willing to do
the same.

whose €11-billion (US$15-billion) annual budget will make him
or her, at least in theory, the most influential figure in European 
science policy. 

Nations are now scrambling to pick and send to Brussels one com- 
missioner each, to provide a pool of 28 from which the research head 
and others will be plucked. Research commissioner is not the most 
prestigious appointment for some of these people. But it is a crucial 
one for Europe’s researchers, many of whom are spending rather too 
much time grumpily pondering career prospects — their own, and 
others’ — in the United States or Asia. 

The right appointment could help to lift their morale. The wrong 
one could squander the promise of Horizon 
2020, the €80-billion, 7-year research-and- 
innovation programme that the European Union 
(EU) instigated this year. 

The longlist is far from complete, so it is too 
early to speculate on who will get the job. What 
is known is that one of their first tasks will be to 
get Horizon 2020 firmly back on the rails. Some 
elements of the programme, notably the Euro- 
pean Research Council, are in reassuringly rude 
health. But there are already ominous murmurs 
among researchers that Horizon 2020 could 
fail to deliver on its promise to address ‘grand 
challenges’ such as ageing and climate change. 

Horizon 2020 relies on an array of old 
‘instruments’ with unsexy names, such as Joint 
Technology Initiatives, to tackle these challenges. 
But it is not clear that the mix of instruments has the necessary cohe- 
sion to make a visible impact on the challenges. And many talented 
university researchers, who still live out their lives in disciplinary silos, 
seem to have baulked at applying for early Horizon 2020 calls that are 
phrased in terms of those broad, societal goals. 

While addressing these problems, the commissioner will have to 
calm ongoing turmoil in the administration of the research directorate 
itself. Hundreds of staff members who deal with research proposals are 
being dispersed to agencies outside the commission. They are unlikely 
to go quietly. Such extensive reorganization tends — at least in the short 
term — to trigger turf wars and backbiting that lower morale and clog 
the system. 

The default position of the staff involved, as with most civil serv- 
ants, is to loftily declare that they expect little — and receive less — in 
the way of support or inspiration from their boss, the commissioner. 
But in real life, leadership does matter. Only a 
Europe needs a research
leader who will lead


The next research commissioner for the European Union will need the drive 
and confidence to clear a daunting in-tray, argues Colin Macilwain. 


vision. That sounds like a cliché, but it happens to be true. The EU’s 
unique seven-year budget process means that the plans for after 2020 
need to be developed on the new commissioner's watch. 

For example, Horizon 2020 was built around three basic ideas — the 
grand challenges, more emphasis on innovation and a larger European 
Research Council — that were firmly in place years ago. That outline 
had taken shape before the financial crisis struck in 2008, hammering 
national budgets and leaving researchers in swathes of Europe with 
minimal funding or job prospects. 

The crisis should have triggered a rethink on how research money 
and other funds could be used to shore up opportunities in regions of 
eastern and southern Europe where the research base is crumbling. 
But despite some late window-dressing, Horizon
2020 doesn’t really take this issue seriously.

To address this, the commission is undertaking
a consultation on ‘Science 2.0’, the buzzword
for its vision of how science should be done
and organized. What is a peer-reviewed paper? 
Whose data is it based on? Who are its authors? 
As Europe’s largest research funder, the commission
needs to provide incentives that will encourage
scientists to embrace, rather than reject, this
portentous but hazily defined future. 

With that in-tray, the commissioner needs to 
be the kind of individual who genuinely believes 
that he or she can make a difference — and will 
rise to the challenge of doing so. 

Ministers and commissioners tend to claim to 
have such ambitions when they begin. But their 
default mode is usually that of a passenger, carried along by officials
and events. Sadly, the departing commissioner, Maire Geoghegan- 
Quinn, falls squarely into this category. 

It does not have to be that way. It is still possible for stout-hearted 
individuals such as Neelie Kroes, the current digital-agenda commis- 
sioner, to exert real influence. 

The most recent research commissioner to leave a large footprint 
was, unfortunately, Edith Cresson, a former French prime minister, 
whose abuse of the position led to her 1999 resignation and subse- 
quent conviction before the European Court of Justice in 2006. Every 
scientist subjected to the commission's hair-raising auditing process 
since then knows the true cost of Cresson’s legacy. 

Research and innovation is now the third-largest programme in 
the EU (after agriculture, and structural funds to aid poor regions). A 
real leader who knows the ropes politically, and has a clear agenda for 
European research from day one, could change the mood music for 
all of European research.


Colin Macilwain writes about science policy from Edinburgh, UK. 
e-mail: cfmworldview@gmail.com 
RESEARCH HIGHLIGHTS

Selections from the scientific literature


Fresh look at
Galactic rim

A survey has provided the most detailed look yet at a mysterious
ring of stars at the fringes of the Milky Way.

Using data from the Pan-STARRS1 telescope in Hawaii, Colin Slater
and Eric Bell at the University of Michigan in Ann Arbor and their
colleagues show that the Monoceros Ring appears as wispy stellar
streams emerging from the Milky Way’s outer disk.

There is debate over how the ring was created, with theories
suggesting that it is either a part of the Galactic disk that was
warped by the influence of nearby dwarf galaxies or the remnants
of a dwarf galaxy that was unfurled in an encounter with the Milky
Way. Neither scenario explains all the details seen in the survey,
however, suggesting that the models need improving.
Astrophys. J. 791, 9 (2014)


Brain scans
predict TV hits

Brain activity measured in just a few individuals watching
television programmes might predict whether large populations of
viewers will find the shows interesting.

Jacek Dmochowski at Stanford University in California and his
colleagues used functional magnetic resonance imaging (fMRI) or
electroencephalography (EEG) to follow brain activity in groups
of up to 16 young adults watching a previously aired episode of
drama programme The Walking Dead (pictured) or advertisements
broadcast during American football Super Bowl games.

The extent to which neural responses to the stimuli were shared
between the small experimental groups correlated with the amount
of social-media activity or positive audience ratings that the
broadcasts had originally elicited from large audiences. Such
neural reliability may be a useful tool in targeting education or
marketing activities to specific groups, the authors suggest.
Nature Commun. 5, 4567 (2014)


Rubbish is a burning problem

Open burning of rubbish contributes more than a trillion kilograms
of carbon dioxide and other greenhouse gases to the atmosphere,
but is often not included in national emissions estimates.

Christine Wiedinmyer, at the National Center for Atmospheric
Research in Boulder, Colorado, and her colleagues estimated global
waste-burning emissions on the basis of factors such as national
population sizes, income and waste production and collection. They
calculated that the CO₂ generated by open waste burning is
equivalent to 5% of reported global anthropogenic emissions in
2010. In some countries, such as Mali and Sri Lanka, these
emissions exceed those reported by the United Nations.

Emissions from burning rubbish are currently not accounted for in
climate and air-quality models, so could explain some
discrepancies between observed levels of pollutants and those
estimated by models.
Environ. Sci. Technol. http://doi.org/txh (2014)


Breaking icebergs
blast out noise

Iceberg disintegrations make the oceans noisier for months, and
in the low frequencies that might affect marine mammals.

Researchers led by Haru Matsumoto at Oregon State University in
Newport studied ocean hydrophone recordings from across the
Southern Hemisphere. They found that noise levels rose throughout
the southern Pacific Ocean for 1.5 years after two huge icebergs
disintegrated near Antarctica between 2007 and 2009. The signal
was detected even north of the equator.
Geochem. Geophys. Geosys. http://doi.org/txf (2014)


Rodents made 
see-through 


The whole body of a rodent 
can be rendered transparent 
for imaging, without 
damaging cells and proteins. 
Previous techniques for 
making tissue transparent
have tended to work only on
specific organs, such as the
brain. A team led by Viviana
Gradinaru at the California
Institute of Technology
in Pasadena tweaked an
existing technique that 
stabilizes tissue and strips out 
light-blocking lipids with a 
cocktail of chemicals pumped 
through a dead rodent’s 
circulatory system. Crucially, 
the process maintained the 
integrity of mouse neurons, 
kidney structures (pictured) 
and other tissue. 

After one week for mice 
and two for rats, the brain 
and internal organs were clear 
and could be imaged under 
a microscope. The technique 
could allow researchers to 
see connectivity between the 
brain and other organs. 

Cell http://doi.org/tzz (2014) 


Manta rays 
change colour 


Csilla Ari of the University
of South Florida in Tampa
observed five manta rays at
the Atlantis Aquarium in the
Bahamas, and discovered
white markings appearing and
disappearing in the space of
a few minutes on their backs,
fins and heads.

The changes seemed to
occur in response to feeding or
interaction with other manta
rays, and may represent a form
of communication. Two of
the animals were giant manta
rays (Manta birostris; one
of two currently recognized
species), and the other three
may belong to a possible third
species, which is similar to but
distinct from M. birostris.
Body coloration is used to 
identify species and individual 
rays, so the author says that 
understanding these colour 
changes is essential. 
Biol. J. Linn. Soc. 
http://doi.org/txc (2014) 


Best gauge of 
exoplanet size 


Astronomers have made the 
most precise measurement so 
far of an exoplanet’s size — for 
Kepler-93b, which orbits a star 
around 100 parsecs away. 
Sarah Ballard at the 
University of Washington 
in Seattle and her colleagues 
estimated the planet's diameter 
at about 18,800 kilometres 
(1.48 times that of Earth), plus 
or minus 240 kilometres. 
They used NASA's
Kepler space telescope to 
monitor seismic activity 
inside the planet's host star. 
They also used the Spitzer 
Space Telescope to observe 
Kepler-93b as it transited the 
star, applying a technique 
that ensured that for each 
measurement, light from the 
star fell on the centre of the 
same pixel in Spitzer's camera. 
This allows for precision 
measurements of exoplanets’ 
radii and masses, and even of 
the structure of their parent 
stars, the authors write. 
Astrophys. J. 790, 12 (2014) 


Novae join the
γ-ray generators

Astronomers have identified a previously unknown source of cosmic
γ-radiation.

High-energy γ-rays are released in extremely energetic events
such as pulsars and supernovae. But they were thought to be
unlikely products of classical novae: explosions that occur on
the surfaces of compact, burnt-out stars called white dwarfs as
they collect material from their neighbours in the binary system.
The only nova previously seen emitting such rays came from an
unusual type of star system.

Now Teddy Cheung at the Naval Research Laboratory in Washington DC
and his colleagues have used NASA's Fermi Telescope to detect
high-energy γ-rays coming from three classical novae.

The otherwise unremarkable properties of the three stars suggest
that such emissions could be common. It is not yet clear how
particles surrounding the stars might be accelerated enough to
produce the energetic radiation.
Science 345, 554-558 (2014)


SOCIAL SELECTION

Furore over genome function

Just how much of our genome serves a purpose? A recent study
has reignited this debate on social media. After comparing the
genomes of 12 different mammals (including humans, mice
and pandas), researchers at the University of Oxford, UK,
concluded that only about 8.2% of the human genome is shaped
by natural selection. The rest, they argue, is non-functional.
Observers noted the large difference between this estimate
and a previous claim by the ENCODE (Encyclopedia of DNA
Elements) Project that 80% of the genome is biochemically
active. Patrik D'haeseleer, a computational biologist at
Lawrence Livermore National Laboratory, California, tweeted:
“Only between 8% and 80% of human #genome is functional.
Glad we've got that sorted out.” At the heart of the issue are
differing definitions of ‘function’. Erick Loomis, an epigeneticist
at Imperial College London, tweeted: “Maybe we should stop
using ‘functional’ if we can't find a common definition.”
PLoS Genet. 10, e1004525 (2014)

Based on data from altmetric.com.
Altmetric is supported by Macmillan
Science and Education, which owns
Nature Publishing Group.


MICROBIOLOGY 


A year with 
your microbes 


Microbial communities
in the gut and mouth have
been followed every day
for an entire year. Stool and
saliva samples collected
from two men show that the
communities remain fairly 
stable, but can be rapidly 
and broadly disrupted by 
events such as a bout of food 
poisoning or a holiday to a 
different continent. 

Eric Alm, at the 
Massachusetts Institute of 
Technology in Cambridge, 
and his colleagues analysed 
these samples as well as health 
and lifestyle variables such 
as fitness, diet, exercise and 
mood recorded by the two 
volunteers. 

One of the men developed 
food poisoning, which 
wiped out most of his gut 
bacteria; the microbes 
were eventually replaced 
with genetically similar 
species. And some lifestyle 
changes perturbed specific 
organisms — increasing 
dietary fibre, for instance, 
affected the abundance of 15% 
of the microbes in the gut. 
Genome Biol. 15, R89 (2014) 
SEVEN DAYS


Study retracted 

A landmark paper proposing
a link between an influenza
vaccine and narcolepsy was
retracted on 31 July. The
study, published last year
(A. K. De la Herran-Arita et al.
Sci. Transl. Med. 5, 216ra176;
2013), reported that some
people with narcolepsy had
immune cells that target a
wakefulness-maintaining
neurotransmitter. The cells also
recognize some components
of flu vaccines, it said, and the
results explained why some
children in Europe developed
narcolepsy after receiving a
vaccine for H1N1 swine flu.
But the team discovered that
it could not reproduce its own
findings, and so retracted the
paper. See go.nature.com/hbrrvi
for more.


DOE opens access 
The US Department of Energy 
(DOE) is making papers 
written by the researchers it 
funds freely available, it said
on 4 August. A DOE online
portal will link to peer- 
reviewed manuscripts and 
final-text journal papers within 
12 months of their publication. 
Up to 30,000 studies are 
expected to be made available 
annually. The department is 
the first US federal agency to 
respond to orders for public 
access and data-sharing issued 
by the government 18 months 
ago. See go.nature.com/jp50wp 
for more. 


Stellar survey

The European Space Agency’s Gaia space telescope is ready to
begin its five-year survey of about 1 billion stars in the Milky
Way, the agency said on 29 July. Gaia will produce the most
accurate three-dimensional map of the Galaxy yet. As it orbits the
Sun, it will measure distances to stars by recording tiny shifts
in their positions, and will observe the stars’ movement through
space. The telescope launched on 19 December but operations were
delayed by light leaking into the detector, where it could have
degraded observations of the faintest targets.


Anti-gay law overturned on technicality

Uganda's anti-gay law, which punishes some homosexual behaviour
with life in prison, was nullified by a court in Kampala on
1 August. The move could help efforts to study HIV transmission
and control the spread of the virus. Gay people in Uganda and
other African nations are frequently unable to access information
on HIV, and those who become infected are often denied treatment
(see Nature 509, 274-275; 2014). The court ruled that the
Anti-Homosexuality Act was invalid because too few members of
parliament voted. It is not yet clear whether Uganda's government
will appeal the ruling.


EVENTS


Hacker attack

Information technology (IT) systems at Canada’s National Research
Council have come under a cyberattack, the agency said on 29 July.
The Canadian government blamed the intrusion on Chinese
state-sponsored hackers. No details were provided about what data
were accessed, but the council told scientists to expect
disruptions. It said it is overhauling its IT infrastructure and
security, including integrating the system with the broader
government network to help protect against future attacks. This
work could take around one year to complete.


Ebola emergency

Officials are stepping up efforts to contain the West African
Ebola outbreak, which had killed 887 people as of 1 August. The
president of Sierra Leone declared a state of emergency on
30 July, allowing the police and military to quarantine infected
homes and villages. The World Health Organization (WHO) also
announced a US$100-million plan to boost the number of
emergency-response staff and scheduled a meeting for 6 August to
discuss the international implications of the epidemic.


FACILITIES 


Space instruments

NASA’s Mars 2020 rover mission will carry seven instruments to
collect rocks for transport back to Earth. On 31 July NASA
announced the winning instruments, which were chosen from 58
competitors. They include a zoomable camera, a machine to generate
oxygen from carbon dioxide, and radar to explore geology up to
half a kilometre deep. On 30 July, NASA also announced that the
International Space Station will get two new instruments to
observe how changes in Earth’s climate and land use affect how
forests and ecosystems function. See go.nature.com/zrbun7 for
more.


Neutrino detector 


An international collaboration
to build a neutrino detector
in China was announced
on 30 July in Beijing. The
Jiangmen Underground
Neutrino Observatory (JUNO)
project, led by the Institute
of High Energy Physics in
Beijing, will bring together
researchers from countries 
including France, Russia and 
the United States. The detector 
will study neutrinos coming 
from supernovae, Earth and 
nearby nuclear reactors. It will 
be the world’s largest liquid 
scintillator detector, which 
captures luminescence when a 
neutrino interacts with atomic 
nuclei in the liquid, and aims 
to give the first measurement 
of the relative masses of the 
three known types of neutrino. 
The facility will be completed 
by 2020. 


Stem-cell suicide

One of Japan’s top stem-cell researchers, Yoshiki Sasai
(pictured), died on 5 August in an apparent suicide. The
52-year-old researcher, who worked at the RIKEN Center for
Developmental Biology in Kobe, was best known for coaxing
embryonic stem cells to differentiate into various types of
mature cells. In 2011, he stunned the world by mimicking an early
stage in the development of the eye in vitro using embryonic stem
cells. But over the past six months he became caught up in the
controversy surrounding two Nature papers that claimed embryonic
stem cells could be created through a method called
stimulus-triggered acquisition of pluripotency (STAP). The papers
were retracted on 2 July after evidence of misconduct was found.
Sasai, a co-author, was cleared of direct involvement but was
criticized for poor oversight of research. “The world scientific
community has lost an irreplaceable scientist,” said RIKEN
president Ryoji Noyori. See go.nature.com/etrboi for more.


TREND WATCH

More than 20% of California shifted from extreme to exceptional
drought — the most severe category — in the week up to 29 July.
An update from the US Drought Monitor shows that exceptional
drought now affects more than half of the state (see map), with
more than 80% classified as under extreme drought or worse.
California is short of more than a year’s worth of reservoir
water. “We wouldn't have expected it to be this dry,” says a
spokesman for the US Department of Agriculture.

[Map: California drought. Drought intensity in California; 58% of
the state is now under conditions of ‘exceptional drought’.
Source: US Drought Monitor]


Director resigns

The director of the US National Institute of Neurological
Disorders and Stroke (NINDS) is stepping down, the institute
announced on 31 July. Story Landis, who has headed NINDS since
2003, had prominent roles in programmes such as the BRAIN
Initiative and the National Institutes of Health’s programme to
improve the reproducibility of science. Landis leaves at the end
of September, when the deputy director of NINDS, neurologist
Walter Koroshetz, will take over as acting director.


Test clampdown 


Hospitals and laboratories in
the United States will soon no
longer be able to design their
own diagnostic tests without
input from the US Food
and Drug Administration
(FDA). On 31 July, the FDA
announced that it will regulate
the development of diagnostic
tests for various diseases, as
well as genetic tests used to
identify patients who may
react to certain treatments. The
regulations will be phased in
over the next nine years and
will prioritize tests for which an
incorrect diagnosis could result
in significant harm to a patient.
See page 5 for more.


Lab safety

Laboratories working with hazardous chemicals must develop a
culture of safety rather than just relying on compliance with
regulations, says a report released on 31 July by the US National
Research Council. The report was motivated by a series of recent
high-profile accidents in university labs (see Nature 493, 9-10;
2013). It recommends better training, proactive analysis of
hazards and rewards for researchers who take precautionary
measures.


10-14 AUGUST
The American Chemical Society autumn meeting in San Francisco,
California, has the theme ‘chemistry and global stewardship’.
go.nature.com/c2to6u

10-15 AUGUST
The Ecological Society of America conference in Sacramento,
California, includes how fire affects ecology as a key topic for
discussion.
www.esa.org/am


Biotech job cuts 
Biotechnology firm Amgen, 
of Thousand Oaks, California, 
announced on 29 July that 

it will cut up to 2,900 jobs 

— around 15% of its global 
workforce. The company said 
it could not yet specify how 
many of the lost jobs will be 

in research and development. 
But it did say that it will close 
facilities in Washington state 
and Colorado, which include 
research and manufacturing 
sites. The cuts will begin later 
this year and follow weak sales 
of the company’s anaemia drug 
Aranesp (darbepoetin alfa). 
Amgen spent US$979 million 
on research and development 
in the second quarter of this 
year, nearly 19% of its sales 
revenue. 
The massive binary-star system η Carinae resembles the first stars that formed in the early Universe.


Binary Star to spill 


celestial secrets 


Close approach and violent interaction of stars in n Carinae 
system will provide rare insight into stellar enigma. 


BY ALEXANDRA WITZE 


After centuries of perplexing scientists with its wildly erratic
behaviour, a nearby star may give up some of its secrets in the
next couple of weeks.

A binary system, η Carinae has two stars that swing past one
another every 5.5 years. The bigger star, some 90 times the mass
of the Sun, is incredibly unstable, always seemingly on the verge
of blowing up. When the smaller companion star makes its closest
approach to the primary star, as is happening now, the interaction
between the two triggers violent changes in the high-energy
radiation pouring out of the system.
Astronomers are watching the show in the hope of learning what
drives this enigmatic system. In the 1840s, η Carinae had a
mysterious eruption; in recent decades, it has again brightened
unexpectedly (see ‘Stellar show’). “The star is in an awfully
deranged state, and no one knows why,” says Kris Davidson, an
astronomer at the University of Minnesota in Minneapolis.

Some answers may come in the next few weeks. Theoretical work
suggests that when η Carinae’s secondary star passes by, its fast
stellar winds bore a huge hole into the outer layers of the
primary star (T. I. Madura et al. Mon. Not. R. Astron. Soc. 436,
3820-3855; 2013). Astronomers expect that if they are right about
this, then a specific series of events will unfold this month,
including a quick rise in the system’s X-ray production after a
drop that began in mid-July.

Studying η Carinae has implications far beyond understanding one
peculiar celestial system. Uncovering its secrets could help
researchers to better understand the earliest stars that winked
into existence. η Carinae is similar in mass to the first stars
that formed in the Universe, billions of years ago. Most of
today’s stars are much lighter, so η Carinae is a rare modern
example of how such a massive star might operate, at the highly
observable distance of 2,300 parsecs from Earth.

Across the Southern Hemisphere, professional and amateur
astronomers are pointing their telescopes at the star, in the
constellation Carina. “It’s the biggest effort ever,” says
Theodore Gull, an astrophysicist at NASA’s Goddard Space Flight
Center in Greenbelt, Maryland.

No one knows exactly when or how η Carinae’s companion will make
its closest approach, but by mid-August it is likely to pass the
primary at a distance equivalent to that between Mars and the Sun.
Both stars in η Carinae emit powerful stellar winds, which at
close range collide, producing a ‘bow shock’ like that seen in
front of ships. The mutual tangle sets off a sequence of bizarre
events.

The stars began brightening in the visible part of the
electromagnetic spectrum in April, and then again in a sharper
peak beginning in mid-June, probably as the companion star
approached and began interacting with the primary star’s winds,
says Eduardo Fernandez Lajus, an astronomer at the National
University of La Plata in Argentina. The system's X-ray production
peaked in mid-July and has since plummeted to near zero, probably
as the colliding winds, where the X-rays are born, have become
entirely unstable and collapsed.


STELLAR SHOW

Scientists have struggled to explain the erratic behaviour of the
binary star η Carinae, which brightened unexpectedly in the 1840s
and again more recently.

The Hubble Space Telescope and other instruments are also tracking
dramatic changes in the chemical-element signatures found in
η Carinae’s light spectrum. The interaction between the two
approaching stars can strip electrons from elements such as iron
and helium, ionizing them more strongly than in normal celestial
environments. “You have these bare helium nuclei; that’s awfully
hard to make in normal circumstances,” says Gull. Watching this
process over time helps to reveal how the stellar winds are
interacting.

At the Pico dos Dias Observatory in southern Brazil, astronomer
Augusto Damineli has been spending every night since 25 July
trying to catch a glimpse of η Carinae through the winter clouds.
On 29 July, his team finally caught a brief opening and managed to
gather data showing that a helium spectral line is dropping in
just the pattern that Damineli expected. “TOUCHDOWN!” he wrote in
an e-mail.

In 2009, when η Carinae had its most recent close encounter, the
system’s X-ray production plunged and then shot back up in half
the time it did in 2003. That could be because the primary star’s
winds are slowing down, so it takes less time for the whole system
to recover. If the wind speeds have continued to drop, X-ray
emissions might shoot up even faster than last time.

Seeing such big differences from one close encounter to the next
is “what everyone is waiting for”, says Andrea Mehner, an
astronomer at the European Southern Observatory in Santiago,
Chile, who is monitoring η Carinae with Hubble. “We cannot make
the star do something exciting if it doesn’t want to.”
A nurse prepares to immunize a young child with pneumococcal and
rotavirus vaccines in Ghana.


Hidden bonus 
from vaccination 


Immunization against pneumococcus in Africa also reduces
levels of antibiotic resistance.


BY EWEN CALLAWAY 


This summer, Eritrea, Côte d’Ivoire and Niger will join a growing
list of countries where infants receive a vaccine to prevent
pneumonia, meningitis and other deadly diseases caused by the
pneumococcus bacterium (Streptococcus pneumoniae). Pneumonia is a
leading killer of young children in low-income countries;
vaccinations from 2010 to the end of this year are estimated to
have averted 500,000 deaths, according to the GAVI Alliance in
Geneva, an international organization that facilitates
vaccination.

Data from South Africa also point to another benefit of
vaccination: stemming a rising tide of antibiotic resistance in
the developing world. The country’s introduction of a pneumococcal
conjugate vaccine (PCV) in 2009 has not only reduced the overall
incidence of invasive pneumococcal disease by about two-thirds in
infants (the age group vaccinated) and in adults, but has also
reduced penicillin-resistant infections in both groups.

This is the first time such benefits have been observed outside
the developed world. The data should spur public-health officials
in low-income countries that have not yet adopted the vaccine to
start using it, says Anne von Gottberg, a clinical microbiologist
at the National Institute for Communicable Diseases in
Johannesburg and leader of the study (see ‘Protecting children’).
Her group has reported the results at conferences but they have
not yet been published.


ALFREDO CALIZ/PANOS

SOURCE: GAVI ALLIANCE

SCIENCE 345, 558-562 (2014)

The problem of antibiotic resistance is particularly stark in
low-income countries, where overprescription and poor regulation
combine with a higher disease burden and poor sanitation to
increase the use of antimicrobial drugs. A recent survey by the
World Health Organization found rates of resistance in Klebsiella
pneumoniae as high as 54%. Reduced susceptibility of Streptococcus
pneumoniae to penicillin was found worldwide, and topped 50% in
some reports.


PROTECTING CHILDREN

Vaccination against the pneumococcus, a leading cause of death in
young children, is gaining ground in Africa.

In North America, Europe and other well-off parts of the world,
the introduction of pneumococcal vaccines in the early 2000s
reduced cases of invasive pneumococcal disease by more than
one-third in vaccinated children and in unvaccinated adults, who
typically acquire infections from children. The vaccine also
reduced the numbers of serious pneumococcal infections that were
resistant to front-line antibiotics such as penicillin.

A study in the United States found that, between 1998 and 2008,
antibiotic-resistant pneumococci decreased by 64% among children
and by 45% among adults over 65 (L. M. Hampton et al. J. Infect.
Dis. 205, 401-411; 2012). The different pneumococcal vaccines
target a handful (7, 10 or 13, depending on the vaccine) of the
more than 90 varieties (serotypes) of the pneumococcus, but those
serotypes are among the most likely to develop antibiotic
resistance. The result is a greater reduction in
antibiotic-resistant strains in the population than in sensitive
strains.

Low-income countries began deploying pneumococcal vaccines around
2009, and more than 40 of them are expected to administer the
vaccine to infants by 2015. Many of these countries receive
vaccines free or at a discount through financial support from the
GAVI Alliance. “Getting those vaccines into low-income countries
in a ten-year time span is an incredibly fast roll-out,” says
Katherine O’Brien, an epidemiologist at the Johns Hopkins
Bloomberg School of Public Health in Baltimore, Maryland. She
notes that most other vaccines have reached the world’s poorest
people much later after being introduced in developed countries.

South Africa, which is not eligible for GAVI support, funded its
own pneumococcal vaccination programme in 2009 using the
seven-strain PCV7, and von Gottberg’s team tracked the effects.
The mix of pneumococcus serotypes circulating in South Africa
differs from that in the Western countries for which the PCV7
vaccine was designed. The types of people who contract
pneumococcal disease also differ in South Africa, where
HIV-infected adults, often mothers, tend to contract the disease
along with children and older people.

Despite these differences, von Gottberg’s team noticed a steep
reduction in rates of invasive pneumococcal disease after
vaccination began. By 2012, cases caused by the serotypes included
in the vaccine had plummeted in both children and middle-aged
adults (by 89% and 57%, respectively), and antibiotic-resistant
infections with those serotypes were also down in the young (by
82%) and across all age groups. The results suggest that the
vaccine’s effects in Western countries translated to children and
adults in a sub-Saharan African country. It is notable, von
Gottberg says, that the incidence of drug-resistant cases fell
more than the incidence of drug-sensitive ones.

O’Brien says that the data suggest that other sub-Saharan African
countries should experience similar benefits after introducing the
vaccine. Ongoing surveillance of the effects of the PCV10 vaccine
roll-out in 2010 in a district in Kenya has also recorded steep
reductions in severe pneumococcal infections (see
go.nature.com/bxhé6qp).

The reductions in antibiotic-resistant pneumococcal disease should
lead to less use of antibiotics overall. A study in Finland found
that the introduction of a different pneumococcal vaccine reduced
antibiotic purchases by 8% (A. A. Palmu et al. Lancet Infect. Dis.
14, 205-212; 2014). Moreover, if doctors are confident that
front-line antibiotics will routinely cure serious pneumococcal
diseases, “they won’t feel that impetus to have to go with a
big-gun antibiotic”, says O’Brien. More prudent use of antibiotics
should forestall the development of drug resistance.

Von Gottberg hopes that her team’s data will encourage countries
that haven’t yet signed up for the vaccine to follow Eritrea,
Niger and Côte d’Ivoire. “To have another weapon in our armament
to reduce antibiotic resistance is a very good story,” she says.
“It will convince governments.”
Lars Bjornshauge set up the Directory of Open Access Journals in 2003.


Open-access 
website gets tough 


Leading directory tightens listing criteria to weed out
rogue journals.


BY RICHARD VAN NOORDEN 


When Lars Bjornshauge founded a website to index open-access
journals in 2003, just 300 titles made the list. But over the next
decade, the open-access publishing market exploded, and
Bjornshauge’s Directory of Open Access Journals (DOAJ) along with
it. Today the DOAJ comprises almost 10,000 journals, and its main
problem is not finding new publications to include, but keeping
the dodgy operators out.
Now, following criticism of its quality-control checks, the
website is asking all of the journals in its directory to reapply
on the basis of stricter criteria. It hopes the move will weed out
‘predatory journals’: those that profess to publish research
openly, often charging fees, but that are either outright scams or
do not provide the services a scientist would expect, such as a
minimal standard of peer review or permanent archiving. “We all
know there has been a lot of fuss about questionable publishers,”
says Bjornshauge. The reapplication process will also create one
of the largest ‘whitelists’ of acceptable open-access journals,
helping the DOAJ to become a more useful tool for funders,
librarians and researchers who want to look up information on a
publication or import its metadata into their catalogues. Those
journals meeting the highest criteria (expected to be about 10-15%
of the total) will also be given a ‘seal’ of best practice.

The DOAJ, which receives around 600,000 page views a month,
according to Bjornshauge, is already supposed to be filtered for
quality. But a study by Walt Crawford, a retired library systems
analyst in Livermore, California, last month (see
go.nature.com/z524co) found that the DOAJ currently includes some
900 titles that are mentioned in a blacklist of 9,200 potential
predatory journals compiled by librarian Jeffrey Beall at the
University of Colorado Denver (see Nature 495, 433-435; 2013). In
addition, journalist John Bohannon last year proved that at least
73 journals in the DOAJ were suspect; in a sting operation, he
sent them an obviously flawed paper which they then accepted for
publication (J. Bohannon Science 342, 60-65; 2013). The DOAJ
removed the journals from its index.


STUNTED GROWTH

The Directory of Open Access Journals grew rapidly until it began
culling low-quality publications last year.

The DOAJ had the idea of introducing stricter standards a few
years ago, says Alma Swan, co-founder of the non-profit company
IS4OA, which now operates the DOAJ (previously it was hosted by
Lund University in Sweden). “We need to show which journals come
up to a minimum standard of quality,” she says.

Since May, would-be new members have had to fill in a tougher
entry form containing more than 50 questions, which will now form
the basis of the reapplication criteria. They include requests for
information on a journal’s digital archiving policy, its editorial
board and its content licensing. “I suspect about 10% of journals
on the list will not be able to pass the reapplication,” says
Bjornshauge.

Paul Peters, the chief strategy officer at open-access publishers
Hindawi, headquartered in Cairo, believes that the new criteria
will be “incredibly important”. “Scholarly researchers need a way
to determine whether a given journal is adhering to best practice,
and I believe that the DOAJ can provide a trusted and scalable
mechanism for doing so,” he says.

It is not clear whether the DOAJ’s whitelist will become the
pre-eminent index of trustworthy open-access journals. Beall says
that the directory’s credibility has already been hurt and that
its new approach is “too little, too late”. He is also not sure
how the DOAJ will spot when a publisher is lying about its
services. Moreover, Beall points out, many researchers and
universities will instead judge a journal’s quality by whether it
is indexed in major citation databases, such as Elsevier’s Scopus
index, rather than looking at the DOAJ’s list.

Bjornshauge says that a small cohort of some 30 voluntary
associate editors (mainly librarians and PhD students) will check
the information submitted in reapplications with the publishers,
and there will be a second layer of checks from managing editors.
He also finds it “extremely questionable to run blacklists of
open-access publishers”, as Beall has done. (Crawford’s study
found that Beall’s apparently voluminous list includes many
journals that are empty, dormant or publish fewer than 20 articles
each year, suggesting that the problem is not as bad as Beall
says.)

But will any kind of whitelist help vulnerable researchers to
avoid publishing in substandard journals? Beall doesn’t think so.
“There’s no evidence that the whitelist approach has been helpful
in encouraging researchers not to become victims of scams,” he
says. “Bad open-access publishers are still growing like crazy.”
HEALTH CARE


US big-data health network 
launches aspirin study 


PCORI clinical-research initiative will collect information on some 30 million people. 


BY SARA REARDON 


One of the largest big-data experiments in health care has set its
first research target. The leaders of the Patient-Centered
Outcomes Research Institute (PCORI) in Washington DC voted on
29 July to focus the institute’s first clinical trial on the use
of aspirin to prevent heart disease. The US$10-million pilot study
will be conducted through PCORnet, a network set up by PCORI to
collect health-care data such as insurance claims, blood tests and
medical histories for as many as 30 million people in the United
States.

PCORnet is the latest, and largest, of a number of initiatives
attempting to use patient data to improve health care. Private
companies such as Kaiser Permanente in Oakland, California, and
medical networks such as Partners HealthCare in Boston,
Massachusetts, have similar aims. PCORI was established in 2010
with more than $3 billion from the US government to fund research
comparing the effectiveness of treatments, and has so far given
out $549 million in grants (see ‘Network profile’). Its own
project, PCORnet, comprises 29 smaller networks of US hospitals
and patient groups. The aspirin trial is part of a pilot phase to
work out questions such as how best to recruit study volunteers,
standardize their records and build rapport with them.

In the second phase, scheduled to begin 
in September 2015, outside scientists will be 
able to mine PCORnet data for their research. 
“My measure of success will be the number 
of questions we get asked,” says Christopher 
Forrest, a paediatrician at Children’s Hospital 
of Philadelphia in Pennsylvania, who leads one 
of the networks participating in PCORnet. 

The aspirin trial is scheduled to launch in early 2015.
Participants will take daily doses of aspirin that fall within the
range typically prescribed for heart disease, and be monitored to
determine whether one dosage works better than the others. “While
this study is a proof of concept, [aspirin dosage] is a very
important question as well,” says PCORI director Joe Selby. “We’re
proving that you can ask and answer questions that matter to
patients, and this one has gone many years without an answer.”

PCORI does not aim to test new potential treatments or uncover
disease mechanisms; that falls within the purview of agencies such
as the US National Institutes of Health. “I see them as
complementing, rather than replacing, randomized clinical trials,”
says William Hersh, a biomedical informatician at Oregon Health &
Science University in Portland who is not involved in PCORI. One
of the greatest challenges will be standardizing data from
different networks to enable accurate comparison, he says. The
many types of data (scans from medical imaging, vital-signs
records and, eventually, genetic information) can be messy, and
record-keeping systems vary between health-care institutions.


18 | NATURE | VOL 512 | 7 AUGUST 2014


DATA POINTS
Network profile

The Patient-Centered Outcomes Research Institute (PCORI) is
setting up a database of US health records that will be used to
compare the effectiveness of different medical treatments. The
system, PCORnet, will connect multiple smaller networks, giving
researchers access to records at a large number of institutions
without creating a central data repository. Here are some key
facts about PCORnet and its parent institution:

• Initial funding for PCORnet: US$93.5 million
• Number of member networks: 29
• Total number of patient records: up to 30 million
• Budget for pilot study of aspirin dosage: $10 million
• Number of projects funded by PCORI since 2012: 313
• Amount of funding awarded by PCORI since 2012: $549 million


ADVICE AND CONSENT 

Transparency in how patients’ records are used will be essential
for the programme to succeed, says Sam Smith, an activist at
medConfidential, a UK-based organization that campaigns for
patient privacy protection. “Generally if you tell people where
their data are going, they’re a lot happier than if you don’t,” he
says. Transparency issues have plagued the UK National Health
Service’s care.data, a medical-records database whose launch has
been delayed amid concerns about data security and informed
consent.

To avoid similar pitfalls, PCORI says that patient representatives
will help to review its grant applications and work with
investigators on trials. But simplifying scientific concepts for
non-specialists can be hard, says Glenn Cohen, a bioethicist at
Harvard University in Cambridge, Massachusetts. “It has a certain
public spirit, but expecting patients to be involved raises
challenges when these policy questions have a technical underlying
element,” he says. Selby says that PCORI plans to work out ways to
deal with such issues over the next year.

To ensure privacy, PCORnet will not collect personal data.
Patients’ records will be kept by their health-care providers.
When outside researchers use PCORnet for studies, the relevant
data can be analysed within the network and the results sent to
the researcher, or anonymized and provided in raw form.

The aspirin study will use standard procedures for obtaining
informed consent, in which participants are told specifically
about the research question their data will be used to address.
But the process will be more complicated when outside researchers
use PCORnet’s data. People may tire quickly of signing forms every
time a researcher wants to use their records for a study. “If very
explicit consent is required, cost and practicality are probably
going to make it a non-starter,” Cohen says. The more
transparency, patient involvement and data security that PCORI can
provide, he says, the more ethically sound it would be to forgo
standard informed-consent procedures.


CORRECTIONS 

In the News story ‘Biosafety controls come 
under fire’ (Nature 511, 515-516; 2014) 
the ‘European Biosafety Organization’ 
should have been the ‘European Biosafety 
Association’. And the story ‘Project drills 
deep into coming quake’ (Nature 511, 
516-517; 2014) gave the old name for the 
Institute of Geological and Nuclear Sciences 
— it is now GNS Science. 
The race is on to build a machine that can synthesize any
organic compound. It could transform chemistry. 


BY MARK PEPLOW 


In faded photographs from the 1960s, organic-chemistry laboratories look like an
alchemist’s paradise. Bottles of reagents line the shelves; glassware blooms from racks
of wooden pegs; and scientists stoop over the bench as they busily build molecules.

Fast-forward 50 years, and the scene has changed substantially. A lab in 2014 boasts
a battery of fume cupboards and analytical instruments, and no one is smoking a
pipe. But the essence of what researchers are doing is the same. Organic chemists
typically plan their work on paper, sketching hexagons and carbon chains on page after
page as they think through the sequence of reactions they will need to make a given
molecule. Then they try to follow that sequence by hand: painstakingly mixing,
filtering and distilling, stitching together molecules as if they were embroidering quilts.


ILLUSTRATION BY RYAN SNOOK 
But a growing band of chemists is now trying to free the field from
its artisanal roots by creating a device with the ability to fabricate any
organic molecule automatically. “I would consider it entirely feasible to
build a synthesis machine which could make any one of a billion defined
small molecules on demand,” declares Richard Whitby, a chemist at the
University of Southampton, UK.

True, even a menu of one billion compounds would encompass just an
infinitesimal fraction of the estimated 10^60 moderately sized
carbon-based molecules that could possibly exist. But it would
still be at least ten times the number of organic molecules that
have ever been synthesized by humans. Such a device could thus
offer an astonishing diversity of compounds for investigation by
researchers developing drugs, agrochemicals or materials.

“A synthesis machine would be transformational,” says Tim Jamison,
a chemist at the Massachusetts Institute of Technology (MIT) in
Cambridge. “I can see challenges in every single area,” he adds,
“but I don’t think it’s impossible”.

A British project called Dial-a-Molecule is laying the groundwork.
Led by Whitby, the £700,000 (US$1.2-million) project began in 2010
and currently runs until May 2015. So far, it has mostly focused
on working out what components the machine would need, and
building a collaboration of more than 450 researchers and 60
companies to help work on the idea. The hope, says Whitby, is that
this launchpad will help team members to attract the long-term
support they need to achieve the vision.

Even if these efforts fall short, say project members, early work 
towards a synthesis machine could still transform chemistry. It could 
deliver a host of reactions that work as continuous processes, rather 
than one step at a time; algorithms that can predict the best way to 
knit a molecule together; and important advances in how computers 
tap vast storehouses of data about the reactivity and other properties 
of chemicals. Perhaps most importantly, it could trigger a cultural sea 
change by encouraging chemists to record and share many more data 
about the reactions they run every day. 

Some reckon it would take decades to develop an automated chemist
as adept as a human, but a less capable, although still useful, device
could be a lot closer. “With adequate funding, five years and we’re done,”
says Bartosz Grzybowski, a chemist at Northwestern University in Evanston,
Illinois, who has ambitious plans for a synthesis machine of his own.


ELECTRIC DREAMS 


If chemists are to have any hope of building their dream device, they 
must pull together three key capabilities. First, the machine must be 
able to access a database of existing knowledge about how molecules 
can be built — which reactions create bonds between carbon atoms, 
for example, or whether using certain reagents to construct one part of 
a molecule risks damaging other parts. Second, it must be able to feed 
this knowledge into an algorithm that can map out synthetic steps, in 
much the same way that a master chess player plans a series of moves to 
win a game. And finally, it must be able to automatically carry out that 
sequence using real reagents inside a robotic reactor. 

The technology for that last step has progressed the farthest. Many
labs already own dedicated machines for churning out strands of DNA
or polypeptides, and in the past decade, adaptable robot chemists
have become increasingly important in commercial pharmaceutical
research. But existing machines have limited capabilities: a DNA or
protein sequence builder is typically able to combine only a handful
of molecular building blocks using fewer than half a dozen reactions.
More versatile synthesis workstations are too expensive for most
academic groups (costing from £30,000 to more than £500,000) and
still tend to produce molecules with a narrow range of chemical
properties.

These workstations also do most of their reactions in the same
batch-by-batch manner as humans. But some chemists are trying to
develop continuous-flow synthesis, in which reactions occur as the
chemicals move through the machine. This can improve speed and
yields, and is a lot more amenable to automation.

Jamison, for example, is working on flow chemistry at the
Novartis-MIT Center for Continuous Manufacturing in Cambridge, and
he is part of a team that last year reported the first end-to-end,
completely continuous synthesis and formulation of a
pharmaceutical: aliskiren hemifumarate, a treatment for high blood
pressure. Jamison and his colleagues built a machine (now
dismantled) that was more than 7 metres long, and about 2.5 metres
high and deep. “It took four years of ‘everything that can


Trout, head of the MIT centre and leader 
of the project. After a lot of trial and error, 
he says, the researchers got to the point at 
which they merely had to flip the switch 
and feed in fresh drums of solvent and raw 
materials. The machine would hum like a large air-conditioning unit as 
stirrers whipped up chemicals, pumps whirred, filtration units dripped 
and squeezed, and a screw conveyer pushed solids through a 2-metre 
drying tube to be injection-moulded. Finally, after 14 operations and 
47 hours, finished tablets dropped down a chute. Batch synthesis would 
have required 21 operations over 300 hours. 
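
The size of that gain can be checked with a little arithmetic on the four figures quoted above (14 operations and 47 hours in continuous flow; 21 operations and 300 hours in batch). The Python sketch below is only an illustration of that comparison; the `Process` class and the helper names are hypothetical and do not come from the MIT team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """One way of making the drug, described by step count and duration."""
    operations: int
    hours: float


# The only inputs are the four numbers quoted in the article.
flow = Process(operations=14, hours=47)
batch = Process(operations=21, hours=300)


def speedup(slow: Process, fast: Process) -> float:
    """How many times faster the fast process is than the slow one."""
    return slow.hours / fast.hours


def fraction_saved(before: float, after: float) -> float:
    """Fraction of the original quantity that is no longer needed."""
    return (before - after) / before


time_speedup = speedup(batch, flow)
ops_saved = fraction_saved(batch.operations, flow.operations)
hours_saved = fraction_saved(batch.hours, flow.hours)

print(f"Flow is about {time_speedup:.1f} times faster than batch")
print(f"Operations saved: {ops_saved:.0%}")
print(f"Hours saved: {hours_saved:.0%}")
```

On these numbers, the flow route is roughly six times faster and needs a third fewer operations.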

Jamison reckons that there is enormous potential for reactions to be
adapted to continuous flow: “I think that it will be well over 50%
eventually, maybe even 75%” of all reactions, he says. Progress is
accelerating, he adds, because fixing a problem in one step (solids
clogging a pipe, say) can offer immediate improvements to other processes.


A CHEMICAL BRAIN 


Although automated machines are growing more versatile, teaching a 
computer to devise its own synthesis remains a massive problem, says 
Yuichi Tateno, an automation researcher at pharmaceutical company 
GlaxoSmithKline in Stevenage, UK, and a member of the Dial-a-Molecule 
collaboration. “The hardware has always been there, but the software and 
data have let it down,” he says. 

Human chemists planning a synthesis tend to use a technique called
retrosynthetic analysis. They draw the finished molecule and then pick
it apart, erasing bonds that would be easy to form and leaving
fragments of molecule that are stable or readily available. This allows
them to identify the chemical jigsaw pieces they need as their raw
materials, and to devise a strategy for connecting the pieces in the
lab. If need be, they can seek inspiration from a commercial database
such as SciFinder (an interface to the American Chemical Society’s
Chemical Abstracts Service) or its main rival Reaxys, offered by
publishing giant Elsevier. Entering a molecular structure or a reaction
into these databases yields examples in the literature. But even with
online help, says Tateno, humans often fail at synthesis. “With the
amount of chemistry that’s out there, there’s nobody who can know it all.”
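
The backwards reasoning described above can be sketched as a small recursive search. The Python below is a toy illustration only: the molecule names, the `DISCONNECTIONS` table and the `AVAILABLE` set are invented placeholders, and real planning software works on chemical structures rather than text labels.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical disconnection table: each target maps to one or more
# alternative sets of precursors that could be joined to make it.
DISCONNECTIONS: Dict[str, List[Tuple[str, ...]]] = {
    "target": [("fragment_A", "fragment_B")],
    "fragment_A": [("building_block_1", "building_block_2")],
    "fragment_B": [
        ("rare_intermediate",),                    # dead end: no known route
        ("building_block_3", "building_block_1"),  # workable alternative
    ],
}

# Hypothetical set of readily available starting materials.
AVAILABLE: Set[str] = {"building_block_1", "building_block_2", "building_block_3"}


def plan(molecule: str, depth: int = 5) -> Optional[List[str]]:
    """Work backwards from `molecule`; return forward-ordered steps or None."""
    if molecule in AVAILABLE:
        return []  # nothing to make, it can simply be taken off the shelf
    if depth == 0:
        return None  # give up on routes that are too long
    for precursors in DISCONNECTIONS.get(molecule, []):
        steps: List[str] = []
        for precursor in precursors:
            sub_plan = plan(precursor, depth - 1)
            if sub_plan is None:
                break  # this disconnection fails, try the next one
            steps.extend(sub_plan)
        else:
            steps.append(f"{' + '.join(precursors)} -> {molecule}")
            return steps
    return None


for step in plan("target") or []:
    print(step)
```

The search abandons the route through the unavailable intermediate and falls back to the alternative disconnection, much as a chemist would discard a fragment that cannot be bought or made.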

The hope is that a synthesis machine could one day do much better, 
says Whitby, not least because computers are so much faster at scanning 
through terabytes of chemical data to find a specific reaction. The bigger 
challenge, he adds, is that computers have a much harder time figuring
out whether that reaction will actually work in a synthesis, particularly
if the target has never been made before. 

That problem bedevilled Elias Corey, a chemist at Harvard University 
in Cambridge, Massachusetts, who formalized the rules of retrosynthe- 
sis in the 1960s. The following decade, Corey created software called 
LHASA (Logic and Heuristics Applied to Synthetic Analysis), which 
could use these rules to suggest sequences 
of steps towards a synthesis. But LHASA
and its successors have never taken off, 
says Grzybowski: either the databases have 
included too few reactions and too many 
errors, or the algorithms have not properly 
assessed whether proposed reactions are 
compatible with all functional groups in the 
molecule. “If we could just make one chemi- 
cal bond at a time, in isolation, chemistry 
would be trivial,” he says. 

Grzybowski has spent the past decade 
building a system called Chematica to 
address those problems. He started by cre- 
ating a searchable network of about 6 million 
organic compounds, connected by a similar
number of reactions, drawn from one of the 
main databases behind Reaxys. His team 
then spent years cleaning up the data — 
identifying entries that lack crucial informa- 
tion about reagent compatibility or reaction 
conditions. Without that kind of clean-up, 
Chematica would be like a computer chef surveying a gigantic recipe book 
for dishes that use ice cream, stumbling on baked Alaska, and concluding 
that ice cream can withstand very high temperatures — missing the fact 
that cooking ice cream in an oven only works with an insulating shield of 
meringue. Chematica includes such crucial information, so its proposed 
syntheses of novel molecules — based on about 30,000 retrosynthetic 
rules — can be much more trustworthy. 

The team also designed Chematica to take a holistic view of synthesis: 
it not only hunts for the best reaction to use at each step, but also con- 
siders the efficiency of every possible synthetic route as a whole. This 
means that a poor yield in one step can be counterbalanced by a succession
of high-yielding reactions elsewhere in the sequence. “In 5 seconds
we can screen 2 billion possible synthetic routes,” says Grzybowski. 
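
The whole-route idea can be illustrated with a minimal sketch. It assumes that a linear route is scored simply by multiplying its fractional step yields; the article does not say how Chematica actually scores routes, and the route names and yields below are invented for illustration.

```python
from math import prod


def overall_yield(step_yields):
    """Overall yield of a linear route: the product of its fractional step yields."""
    return prod(step_yields)


# Hypothetical routes to the same target (all numbers are invented).
# "strong_start" opens with an excellent step but then stalls;
# "weak_start" opens poorly but is followed by high-yielding steps.
ROUTES = {
    "strong_start": [0.95, 0.40, 0.60],
    "weak_start": [0.55, 0.90, 0.92, 0.95],
}


def best_route(routes):
    """Return the name of the route with the highest overall yield."""
    return max(routes, key=lambda name: overall_yield(routes[name]))


if __name__ == "__main__":
    for name, yields in ROUTES.items():
        print(f"{name}: {overall_yield(yields):.3f}")
    print("best:", best_route(ROUTES))
```

Judged by its first step alone, the strong-start route looks better; judged as a whole, the weak-start route wins, which is the kind of trade-off the holistic view is meant to capture.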


STRONGER, FASTER, CHEAPER 

When Grzybowski first unveiled the network behind Chematica in 2005 
(ref. 3), “people said it was bullshit”, he laughs. But that changed in 2012,
when he and his team published a trio of landmark papers showing
Chematica in action. For example, the program discovered a slew of ‘one-pot’
syntheses in which reagents could be thrown into a vessel one after the
other, without all the troublesome separation and purification of products 
after each step. The group tested Chematica’s suggestions for making a 
range of quinolines — structures commonly found in drugs and dyes — 
and showed that many were more efficient than conventional approaches. 

Chematica can also look up information about the cost of starting 
materials and estimate the labour involved in each reaction, allowing 
it to predict the cheapest route to a particular molecule. When Grzybowski’s
lab tested 51 cut-price syntheses suggested by Chematica, it
collectively trimmed costs by more than 45%. 

These demonstrations have impressed synthetic chemists, although 
few have had a chance to test Chematica. That is because Grzybowski is 
hoping to commercialize the system: he is negotiating with Elsevier to 
incorporate the program into Reaxys, and is working with the pharma- 
ceutical industry to test Chematica’s synthesis suggestions for biologi- 
cally active, naturally occurring molecules. Grzybowski is also bidding 
for a grant from the Polish government, worth up to 7 million zloty 
(US$2.3 million), to use Chematica as the brain of a synthesis machine
that can prove itself by automatically planning and executing syntheses
of at least three important drug molecules.

Others are doubtful that will happen — at least any time soon. For 
the foreseeable future, “there will always be a significant need for human 
intervention”, says Simon Tyler, commercial director of CatSci, a contract-research
company in Cardiff, UK, that is involved in Dial-a-Molecule.
“We won’t have RoboCops wandering around in the lab.”

And as long as programs like Chematica
rely on databases of published studies,
says Whitby, they will struggle to design 
reliable synthetic routes to unknown com- 
pounds. To build a synthesis machine, “we 
need to be able to predict when a reaction is 
going to work — but more importantly we 
need to be able to predict when it’s going 
to fail”. 

Unfortunately, those failures are rarely 
recorded in the literature. “We only publish 
the successes, a cleaned-up version of what 
happens in the lab,” says Whitby. “We also
lose a lot of information: what really was the 
temperature, what was the stirring speed, 
how much solvent did you use?” 

One solution is to record those successes 
and failures using electronic laboratory 
notebooks (ELNs), computer systems for 
logging raw experimental data that are 
widely used in industry but still rare in academia
(see Nature 481, 430-431; 2012). “A
lot of people ask, ‘Who reads all these data?’ The point is that machines
use them — they can search the data,” explains Mat Todd, a chemist at
the University of Sydney in Australia. 

In principle, automated workstations and instruments could send 
information to an ELN, which would upload the details to an open- 
access database where they could help a synthesis machine to predict 
how reliable a reaction might be. “If we really did know the history of 
every chemical reaction that had ever been done, we’d have amazing
predictive capabilities,” says Todd. 

Dial-a-Molecule researchers have coordinated trials of ELNs in aca- 
demic labs; started to devise a standardized, machine-readable format for 
ELN records; and developed software that can push those data into open 
databases such as ChemSpider. Others in the network have developed 
prototype software called PatentEye, which could pull in extra data by 
scraping and cataloguing chemical information from patents. 
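
As a rough illustration of what a machine-readable notebook entry might hold, here is a minimal sketch. The `ReactionRecord` type and its field names are hypothetical and are not the format that Dial-a-Molecule researchers are devising; they simply capture the details Whitby says are usually lost (temperature, stirring speed, solvent volume) and keep failures alongside successes.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class ReactionRecord:
    """Hypothetical ELN entry; every field name here is an assumption."""
    reactants: List[str]
    product: str
    solvent: str
    solvent_volume_ml: float
    temperature_c: float
    stirring_rpm: int
    succeeded: bool
    yield_fraction: Optional[float] = None  # left empty when the reaction failed


def to_json(record: ReactionRecord) -> str:
    """Serialize a record to JSON so that software can store and search it."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text: str) -> ReactionRecord:
    """Rebuild a record from its JSON form."""
    return ReactionRecord(**json.loads(text))


# A failed attempt is logged with the same detail as a success would be.
failed_attempt = ReactionRecord(
    reactants=["compound A", "compound B"],  # placeholder names
    product="compound C",
    solvent="solvent X",
    solvent_volume_ml=25.0,
    temperature_c=60.0,
    stirring_rpm=400,
    succeeded=False,
)

if __name__ == "__main__":
    print(to_json(failed_attempt))
```

Because a failed run survives the round trip through JSON unchanged, a database built from such records could in principle be queried for conditions under which a reaction did not work.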

Many of those dreaming of a synthesis machine agree that widespread 
data harvesting will require a huge cultural shift. “That's absolutely the 
biggest barrier,” says Todd. “In chemistry, we don't have that culture of
sharing, and I think it’s got to change.” 

Money is also a significant hurdle. The expense of automated work- 
stations means that few academics are familiar with them or their poten- 
tial for capturing data. And with a large workforce of graduate students to 
draw on, academic labs often have little incentive to automate. Whitby is 
lobbying for a national centre that would host state-of-the-art automated 
synthesis equipment and software, to encourage their development and 
use. Until that materializes, he hopes that Dial-a-Molecule will inspire 
a new generation of chemists to embrace data sharing and automation. 

Grzybowski, for one, is convinced that the synthesis machine can 
become a reality: “The only thing that can kill it is scepticism.”


Mark Peplow is a science journalist based in Cambridge, UK.
The aurora australis over the German Antarctic research base, Neumayer-Station III.


Six priorities for 
Antarctic science 


Mahlon C. Kennicutt II, Steven L. Chown and colleagues outline the 
most pressing questions in southern polar research, and call for greater 
collaboration and environmental protection in the region. 


Antarctica. The word conjures up
images of mountains draped with
glaciers, ferocious seas dotted with 
icebergs and iconic species found nowhere 
else. The continent includes about one- 
tenth of the planet’s land surface, nearly 
90% of Earth’s ice and about 70% of its 
fresh water. Its encircling ocean supports 
Patagonian toothfish and krill fisheries, 
and is crucial for regulating climate and the 
uptake of carbon dioxide by sea water. 

Antarctic scientists are unlocking the
secrets of Earth’s climate, revealing lakes and
mountains beneath the ice, exploring the deep 
sea and contemplating the origins of life and 
the Universe. Once seen as a desolate place 
frozen in time, Antarctica is now known to be 
experiencing relentless change. Local trans- 
formations such as the loss of ice, changes 
in ocean circulation and recovery of atmos- 
pheric ozone have global consequences — for 
climate, sea level, biodiversity and society. 

In April 2014, the Scientific Committee 
on Antarctic Research (SCAR) convened
75 scientists and policy-makers from
22 countries to agree on the priorities for Ant- 
arctic research for the next two decades and 
beyond. This is the first time that the interna- 
tional Antarctic community has formulated a 
collective vision, through discussions, debate 
and voting. The SCAR Antarctic and South- 
ern Ocean Science Horizon Scan narrowed a 
list of hundreds of scientific questions to the 
80 most pressing ones (see Supplementary 
Information; go.nature.com/iilhsa). A full 
report will be published in August.

Here we summarize the overarching
scientific themes, and outline steps that 
researchers and governments must take to 
make this vision a reality. Securing fund- 
ing, as well as access to and protection for 
the region, will make greater international 
collaboration a necessity. 


SIX SCIENTIFIC PRIORITIES 

The questions identified fall broadly into six 
themes. To realize the full potential of Ant- 
arctic science we need to do the following. 


Define the global reach of the Antarctic 
atmosphere and Southern Ocean. Changes 
in Antarctica’s atmosphere alter the planet’s 
energy budgets, temperature gradients, and 
air chemistry and circulation. Too little is 
known about the underlying processes. 
How do interactions between the atmos- 
phere, ocean and ice control the rate of 
climate change? How does climate change 
at the pole influence tropical oceans and 
monsoons? How will the recovering ozone 
hole and rising greenhouse-gas concentra- 
tions affect regional and global atmospheric 
circulation and climate? 

The Southern Ocean has important roles 
in the Earth system. It connects the world’s 
oceans to form a global system of currents 
that transfers heat and CO₂ from the atmosphere
to the deep ocean. Nutrients carried
north support the base of the ocean’s food
web. The ocean is becoming more acidic
as CO₂ dissolves in sea water, and cold
southern waters will be the first to exhibit
impacts. How will climate change alter the
ocean's ability to absorb heat and CO₂ and
to support ocean productivity? Will changes 
in the Southern Ocean result in feedbacks 
that accelerate or slow the pace of climate 
change? Why have the deepest waters of 
the Southern Ocean become warmer and 
fresher in the past four decades? 

Sea ice reflects and filters sunlight. It 
modulates how heat, momentum and gases 
exchange between the ocean and atmos- 
phere. Sea-ice formation and melt dictate 
the salt content of surface waters, affecting 
their density and freezing point. What fac- 
tors control Antarctic sea-ice seasonality, 
distribution and volume? We need to know. 


Understand how, where and why ice sheets 
lose mass. The Antarctic ice sheet contains 
about 26.5 million cubic kilometres of ice, 
enough to raise global sea levels by 60 metres 
if it returned to the ocean. Having been 
stable for several thousand years, the Antarctic
ice sheet is now losing ice at an accelerating
pace. What controls this rate and
the effect on sea level? Are there thresholds
in atmospheric CO₂ concentrations beyond
which ice sheets collapse and the seas rise
dramatically? How do effects at the base of
the ice sheet influence its flow, form and
response to warming? Water bodies beneath
the thick ice sheet have barely been sampled, 
and their effect on ice flow is unknown. 


Reveal Antarctica’s history. Glimpses of the 
past from rock records collected around the 
continent's margins suggest that Antarctica 
might look markedly different in a warmer 
world. But rocks from the heart of the continent
and the surrounding oceans have been only
sparsely probed. Responses of the crust to, and the
effects of volcanism and heat from Earth's
interior on, overlying
ice are largely undescribed. We know little
about the structure of the Antarctic crust and
mantle and how it influenced the creation
and break-up of super-continents. Ancient
landscapes beneath ice reveal the history of
interactions between ice and the solid Earth.
Geological signatures of past relative sea level
will show when and where planetary ice has
been gained or lost. We need more ice, rock
and sediment records to know whether past
climate states are fated to be repeated.


Learn how Antarctic life evolved and 
survived. Antarctic ecosystems were long 
thought of as young, simple, species-poor 
and isolated. In the past decade a different pic- 
ture has emerged. Some taxa, such as marine 
worms (polychaetes) and crustaceans (iso- 
pods and amphipods) are highly diverse, and 
connections between species on the continent, 
neighbouring islands and the deep sea are 
greater than thought. Molecular studies reveal 
that nematodes, mites, midges and freshwater 
crustaceans survived past glaciations. 

To forecast responses to environmental 
change we need to learn how past events have 
driven diversifications and extinctions. What 
are the genomic, molecular and cellular bases 
of adaptation? How do rates of evolution in 
the Antarctic compare with elsewhere? Are 
there irreversible environmental thresholds? 
And which species respond first? 


Observe space and the Universe. The dry, 
cold and stable Antarctic atmosphere creates 
some of the best conditions on Earth for 
observing space. Lakes beneath Antarctic 
glaciers mimic conditions on Jupiter’s and
Saturn's icy moons, and meteorites collected
on the continent reveal how the Solar System 
formed and inform astrobiology. 

We have limited understanding of high- 
energy particles from solar flares that are 
funnelled to the poles along the Earth’s 
magnetic field lines. What is the risk of solar 
events disrupting global communications 
and power systems? Can we prepare for 
them and are they predictable?


Recognize and mitigate human influences.
Forecasts of human activities and their 
impacts on the region are required for 
effective Antarctic governance and 
regulation. Natural and human impacts 
must be disentangled. How effective are 
current regulations in controlling access? 
How do global policies affect people’s moti- 
vations to visit the region? How will humans 
and pathogens affect and adapt to Antarc- 
tic environments? What is the current and 
potential value of Antarctic ecosystem 
services and how can they be preserved? 


CHALLENGING ENVIRONMENT 

Answering these many questions will require 
sustained and stable funding; access to all of 
Antarctica throughout the year; application 
of emerging technologies; strengthened pro- 
tection of the region; growth in international 
cooperation; and improved communication 
among all interested parties. 

Antarctic programmes are sensitive to 
budget uncertainties and disruptions. In 
the past year, US projects were deferred, 
delayed or reduced in scale because of the 
US government shutdown in October 2013. 
Other national programmes suffered budget 
cuts stemming from the economic slow- 
down. High fuel prices and diversions for a 
major search and rescue mission hindered 
some. Decades-long projects are difficult to 
sustain given short grant cycles. 

Postponed projects and lost field seasons 
leave gaps — a missing year of data for an 
ice-sheet study or biodiversity monitoring is 
irreplaceable. Faced with such uncertainties 
and hurdles, and with laboratories and stu- 
dents to support, some Antarctic researchers 
choose to leave the field. This also jeopard- 
izes the recruitment and retention of the 
next generation of researchers. 

Access to locations needed for science
is limited. Much of the continent and the
Southern Ocean remain unexplored, and
most scientists visit for only a few months
each year. Researchers will need to develop 
autonomous vehicles and observatories that 
can reach remote locations such as beneath 
ice shelves, the deep sea and under ice sheets. 
Miniaturized sensors deployable on floats, 
animals and ice tethers must be able to 
acquire or transmit data for months or years. 

A wider range of satellite-borne sensors 
is needed to continuously observe the entire 
region. Expanded aircraft-based geophysi- 
cal surveys are needed to access the conti- 
nental interior and ice margins. Advanced 
biogeochemical and biological sensors will 
be crucial for establishing regional patterns. 
Databases and repositories that can handle 
vast quantities of genomic and biodiversity 
information will be essential. 

Future data sets will require high-speed 
and high-volume communications over great 
distances. Reliable sources of energy to power
remote observatories and better ways to store
and uplink data will be needed. Improved
computer models are essential for portraying
the highly interconnected Antarctic and
Earth system if we are to improve forecasts.

Emperor penguins dive under a breathing hole in the Antarctic sea ice, which provides a platform for marine life.

Antarctica’s environmental-protection 
measures must be strengthened. More
scientists will need to visit, and tourist numbers
have almost tripled in the past decade
to more than 34,000 a year plus support
personnel. This growth increases the risk
of introducing non-indigenous species and
the likelihood of fuel spills that we are
ill-equipped to respond to effectively.

The Antarctic Treaty System, which is 
responsible for governance of the region, is 
being tested by mounting environmental pressures
and economic interests. The establishment
of marine protected areas, international
regulation of tourism, assessing financial penalties
for environmental damage and regulating
bioprospecting have proved difficult to
resolve. An integrated strategy for Antarctic
environmental management is essential.

Antarctica is seen as a place to assert 
national interests. In the past decade,
countries including Belgium, China, the 
Czech Republic, India and South Korea 
have established new stations; Germany, 
the United Kingdom, the United States and 
others have replaced ageing ones; and Japan, 
South Korea and South Africa have built or 
replaced ice-capable ships. 

Yet scientists from many other nations 
lack access to Antarctica. Twenty-nine 
countries participate in decision-making 
and another twenty-one have agreed to 
abide by the Antarctic Treaty. Although this
represents about two-thirds of the world
population, it comprises less than one-sixth 
of the 193 member states of the United 
Nations — countries in Africa and the 
Middle East are notably under-represented. 


WORK TOGETHER 

Maximizing scientific return while mini- 
mizing the human footprint should be the 
goal. Coordinated international efforts that 
engage diverse stakeholders will be crucial. 

It is time for nations involved in southern 
polar research to embrace a renewed spirit 
of cooperation as espoused by the founders 
of the Antarctic Treaty — in actions not just 
words. Wider international partnerships, 
more coordination of science and infra- 
structure funding and expanded knowledge- 
sharing are essential. 

As an interdisciplinary scientific body, 
but not a funder of research, SCAR should 
assist with and encourage coordination and 
planning of joint projects, sharing of data 
and dissemination of knowledge to policy- 
makers and the public. SCAR should repeat 
the Horizon Scan exercise every four to six 
years and provide the outcomes to emerging 
integrated science, conservation and policy 
efforts (see www.environments.aq).

We urge the Antarctic Treaty and its 
Committee for Environmental Protection 
to expand use of scientific evidence in its 
decision-making and to apply state-of-the-art
conservation measures judged on measurable
outcomes.

Communicating the global importance of 
Antarctica to the public is a priority. Narratives
must better explain how the region
affects and is influenced by our daily lives.
Antarctic success stories, such as signs of 
ozone recovery, engender confidence in the 
power of changes in behaviour. 

Antarctic science is globally important. 
The southern polar community must act 
together if it is to address some of the most 
pressing issues facing society.


BOOKS & ARTS


Artistic alchemy 


Philip Ball unveils the scientific iconography in 
Albrecht Dürer’s enigmatic engraving Melencolia I.


Albrecht Dürer’s Melencolia I, engraved in 1514, seems an
open invitation to the cryptologist. Packed with occult symbolism from
alchemy, astrology, mathematics and
medicine, it promises hidden messages and
recondite meanings. What it really tells us,
however, is that Dürer was a philosopher-artist
of the same stamp as Leonardo da
Vinci, immersed in the intellectual currents
of his time. In the words of art historian
John Gage, Melencolia is “almost an
anthology of alchemical ideas about the
structure of matter and the role of time”.

Dürer’s brooding angel is surrounded
by the instruments of the proto-scientist: 
a balance, an hourglass, measuring cali- 
pers, a crucible on a blazing fire. Here, too, 
is numerological symbolism in the ‘magic 
square’ of the integers 1-16, the rows, col- 
umns and main diagonals each adding up 
to 34: a common emblem of both folk and 
philosophical magic. Here is the astro- 
logical portent of a comet, streaming past 
an improbable rainbow, a symbol of the 
colour-changing processes of the alchemi- 
cal route to the philosopher’s stone. And 
here is the title itself: melancholy, associ- 
ated in ancient medicine with black bile, the 
same colour as the material with which the 
alchemist’s Great Work to make gold was 
supposed to begin. 
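
The arithmetic of the magic square is easy to check. The short sketch below uses the arrangement of the integers 1 to 16 as the square in the engraving is usually transcribed (that layout is supplied here and is not given in the text above) and confirms that every row, column and main diagonal adds up to 34.

```python
# Layout as usually transcribed from the engraving (an assumption of this sketch).
DURER_SQUARE = [
    [16, 3, 2, 13],
    [5, 10, 11, 8],
    [9, 6, 7, 12],
    [4, 15, 14, 1],
]


def line_sums(square):
    """Sums of every row, every column and the two main diagonals."""
    n = len(square)
    rows = [sum(row) for row in square]
    cols = [sum(square[r][c] for r in range(n)) for c in range(n)]
    diagonals = [
        sum(square[i][i] for i in range(n)),
        sum(square[i][n - 1 - i] for i in range(n)),
    ]
    return rows + cols + diagonals


def is_magic(square):
    """True if the square uses 1..n*n exactly once and all line sums agree."""
    n = len(square)
    values = sorted(v for row in square for v in row)
    return values == list(range(1, n * n + 1)) and len(set(line_sums(square))) == 1


if __name__ == "__main__":
    print(line_sums(DURER_SQUARE))
    print(is_magic(DURER_SQUARE))
```

The common total follows from the general rule for an n-by-n square of the integers 1 to n²: the magic constant is n(n² + 1)/2, which is 34 when n is 4.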

But why the tools of the craftsman 
— the woodworking implements in the 
foreground, the polygonal block of stone 
awaiting the sculptor’s hammer and chisel? 
Why the tormented, introspective eyes of the 
androgynous angel? 

Melencolia I is part of a trio of complex 
copperplate etchings that Dürer made in
1513-14. Known as the Master Engravings,
they are considered collectively to raise 
this new art to an unprecedented standard 
of technical skill and psychological depth. 
This cluttered, virtuosic image is often 
said to represent a portrait of Dürer’s own
artistic spirit. Melancholy, often consid- 
ered the least desirable of the four classical 
humours then believed to govern health 
and medicine, was traditionally associated 
with insanity. But during the Renaissance 
it was reinvented as the humour of artis- 
tic temperament, originating the popular 
link between madness and creative genius. 
The German physician Cornelius Agrippa, 
whose influential Occult Philosophy (written
around 1510) Dürer is almost certain
to have read, claimed that “celestial spirits” 
were apt to possess the melancholy man and 
imbue him with the imagination required of 
an “excellent painter”. 

The connection to Agrippa was first made 
in 1943, by the art historian Erwin Panofsky, 
doyen of symbolism in art. He argued that 
Dürer’s angel is vexed by the artist's sense
of failure: an inability to fly, to exceed the 
bounds of the human imagination and create 
the truly wondrous. As a consequence, her 
tools lie abandoned. 

Why astronomy, geometry, meteorol- 
ogy and chemistry should have any relation 
to the artistic temperament is not obvious 
today. But in the early sixteenth century, 
the connection would have been taken for 
granted by anyone familiar with the Neo- 
platonic idea of correspondences in nature 
— the notion that all natural phenomena, 
including the predispositions of humanity,
are joined in a web of hidden forces and
symbols. The humour melancholy, for
instance, is governed by the planet Saturn,
whence the word ‘saturnine’.

So there would have been nothing obscure 
about this picture for its intended audience 
of intellectual connoisseurs. Dürer mastered
and exploited the new technologies
of printmaking, so he was able to distribute 
his works widely, and he indicated in his 
diaries that he sold many and gave others 
as gifts to friends and humanist scholars 
such as Erasmus of Rotterdam. Ferdinand 
Columbus, son of Christopher, collected 
more than 3,000 prints, 390 of them by 
Dürer and his workshop.

Although the alchemical imagery of 
Melencolia I was part of this engraving’s 
‘occult parcel’, it would be wrong to imagine
that alchemy was, to Dürer and his contemporaries,
purely an esoteric art. As Lawrence
Principe has argued in The Secrets of Alchemy
(University of Chicago Press, 2012),
this precursor to chemistry was
not only, or even primarily, about
furtive and futile attempts to make
gold from base metals. It was also
a practical craft, not least in providing
artists with their pigments. For this
reason, artists commonly knew something
of its techniques; Lucas Cranach the Elder,
a friend of Dürer’s, was a pharmacist on the
side, which may explain why he was almost 
unique in northern Europe in using the 
rare and highly poisonous yellow pigment 
orpiment, an arsenic sulphide. The extent of 
Dürer’s chemical knowledge is not known,
but he was one of the first artists to use acids 
for etching metal, a technique developed 
only at the start of the sixteenth century. 
The process required specialist knowledge: 
it typically used nitric acid, made from salt- 
petre, alum and ferrous sulphate, mixed with 
‘Dutch mordant’ composed of dilute hydro- 
chloric acid and potassium chlorate. 
Humility should perhaps compel us to 
concur with art historian Keith Moxey that 
“the significance of Melencolia I is ultimately 
and necessarily beyond our capacity to 
define” — we are too removed from it now 
for its themes to resonate. But what surely 
endures in this image is a reminder that for 
the Renaissance artist there was continuity 
between theories about the world, matter 
and human nature, the practical skills of the 
artisan, and the business of making art.


Philip Ball is a writer based in London. His 
latest book is Invisible. 
e-mail: p.ball@btinternet.com 


Books in brief 


A Message from Martha: The Extinction of the Passenger Pigeon 
and its Relevance Today 

Mark Avery BLOOMSBURY (2014) 

A century ago, Martha — the last of the passenger pigeons (Ectopistes
migratorius) — died in a zoo in Cincinnati, Ohio. Less than 100 years
before, the species had been Earth’s most common bird, darkening
the sky and pulling down trees with the sheer size of its colonies.
Conservationist Mark Avery’s chronicle is based on science, historical
accounts, a 6,000-kilometre road trip to key US sites, and numerous
interviews. It offers a considered perspective on the habitat loss and
unsustainable harvesting that led to the pigeon’s demise.


Thrive: The Revolutionary Potential of Evidence-Based
Psychological Therapies

Richard Layard and David M. Clark ALLEN LANE (2014)

More than 350 million people worldwide have depression, estimates
the World Health Organization. Yet mental illness is a policy
blind spot and access to treatment is poor — a “shocking form
of discrimination”, say psychologist David Clark and economist
Richard Layard. Drivers of the UK Improving Access to Psychological
Therapies initiative, they make the case for tackling the burden now,
and draw up a road map for evidence-based therapies recommended
by the UK National Institute for Health and Care Excellence.


Extracted: How the Quest for Mineral Wealth Is Plundering
the Planet

Ugo Bardi CHELSEA GREEN (2014)

Our dependence on fossil fuels and minerals is growing ever more 
costly, as extraction rapidly creams off the “easy” deposits. So argues 
physical chemist Ugo Bardi in this in-depth study of the issue, based 
on a report to the Club of Rome, a global think tank. Bardi examines
depletion models, pollution and climate change, the viability of 
mining oceans or asteroids, and options such as waste recycling. If 
rampant extraction persists, he notes, we will not need spaceships to 
find a new world: we'll be standing on it, and it will not be pretty. 


Whatever Happened to the Metric System?: How America
Kept Its Feet

John Bemelmans Marciano BLOOMSBURY USA (2014)

Miles and pounds are here to stay — in the United States, at least. Its 
measurement system survived an abortive federal metric initiative 
in the 1970s and 1980s and, writer John Bemelmans Marciano
reveals in this digressive history, a much earlier attempt. In 1790 
Thomas Jefferson hoped to follow up his decimal currency with a 
decimal system of weights and measures. Instead, it was US ally 
post-revolutionary France, burdened with a hideously complicated 
system of weights and measures, that authored the metric system. 


The Amazing World of Flyingfish 

Steve N. G. Howell PRINCETON UNIVERSITY PRESS (2014) 

The more than 60 species of the family Exocoetidae literally sail the 
seas — or, more accurately, just above them. The streamlined bodies 
of flying fish lend them underwater speed that they can convert into 
lift; winglike pectoral fins allow them to glide as far as 180 metres to 
escape predators. They can even use updraughts of air. Ornithologist 
Steve Howell’s engrossing natural history is embellished with
90 superb colour photographs of the ornate goldwing and other
beauties among these “ocean butterflies”. Barbara Kiser


Correspondence


Private collections 
hold back science 


Christian Foth and colleagues 
describe the spectacular eleventh 
specimen of the earliest bird, 
Archaeopteryx, including 
plumage features previously 
unknown for this pivotal taxon 
(Nature 511, 79-82; 2014). 
We are concerned, however, 
about access to this important 
specimen after its long-term 
loan to a public museum expires, 
when it will be returned to a 
private collection. 

This Archaeopteryx has been 
registered under the ‘Act to 
Prevent the Exodus of German
Cultural Property’ (see
go.nature.com/xyk5lz), which requires
its whereabouts to be recorded
and prevents loss of German 
heritage. However, public access 
to such specimens remains at the 
owner's discretion. 

Many journals refuse to 
publish characteristics of 
specimens held in private 
collections because the 
observations cannot be 
independently verified, which 
is counterproductive for the 
scientific community. Instead, 
journals should insist on a 
guarantee of future access as a 
condition of publication. 

Paul M. Barrett, Martin C. 
Munt Natural History Museum, 
London, UK. 
p.barrett@nhm.ac.uk 


Smart tools boost 
mental-health care 


Portable electronic devices 
such as smartphones and 
virtual-reality headsets provide 
clinicians with a powerful tool 
for improving evidence-based 
psychological treatments (see 
Nature 511, 287-289; 2014). 
For example, they offer patients 
more-frequent access to 
therapy and are likely to boost 
compliance. 

Internet-based cognitive 
behavioural therapy, for 
instance, is effective for 
conditions such as depression 
and anxiety. Immersive virtual 
reality helps with some phobias, 
including fear of flying and fear 
of heights, and holds promise 
for conditions such as eating 
disorders and post-traumatic 
stress disorder (see G. Riva 
CyberPsychol. Behav. 8, 220-230; 
2005). 

Smartphones, in conjunction 
with wearable sensors, convey 
information about patients’ 
activity, location and even 
physiological responses, 
providing insight into how 
well they function in everyday 
life and guiding decisions on 
mental-health interventions. 

Virtual reality could also be 
a useful tool for researchers 
investigating the neural basis of 
functional and dysfunctional 
psychological processes (see 
C. J. Bohil et al. Nature Rev. 
Neurosci. 12, 752-762; 2011). 
Andrea Gaggioli, Giuseppe 
Riva Catholic University; and 
Istituto Auxologico, Milan, Italy. 
andrea.gaggioli@unicatt.it 


Rules for assessing 
pain in lab animals 


New rules in the European Union 
directive for protecting laboratory 
animals call for a formal 
assessment of any pain that 
these animals might experience 
during scientific procedures (see 
go.nature.com/porg2x). 

An upcoming workshop, 
organized by Germany's Federal 
Institute for Risk Assessment, 
will evaluate national guidelines 
on pain assessment drawn up in 
a similar workshop last year, with 
particular reference to ‘harmful 
phenotypes’ — animal lines that 
are likely to experience pain as a 
result of genetic alteration (see 
go.nature.com/8dhjcz). 

Maintenance of such 
genetically altered animals, which 
carry the defining mutation at a 
single genetic locus, is currently 
subject to project authorization. 
This consent ensures that pain 
is monitored closely in these 
animals by veterinarians, 
animal-welfare officers, scientists 
and laboratory staff who are 
responsible for animal care. 
Breeding more animals for the 
purpose of assessing pain severity 
is not permitted, and neither is 
subjecting them to further pain 
— as inflicted, for example, by 
routine blood testing. 
Participants in the next 
workshop, to be held in October, 
will collate experiences resulting 
from implementation of the 
2013 guidelines and will develop 
a database for pain-assessment 
results from genetically altered 
animals. 
Barbara Grune, Andreas 
Hensel Federal Institute for 
Risk Assessment (BfR), Berlin, 
Germany. 
Gilbert Schönfelder BfR; and 
Charité-Universitätsmedizin 
Berlin, Germany. 
gilbert.schoenfelder@bfr.bund.de 


Early life is key to 
disease risk 


Lawrence O. Gostin calls 
for action to curb 
non-communicable diseases 
(Nature 511, 147-149; 2014). 
Interventions in early life can 
also make a big difference. 

There is now overwhelming 
evidence that factors in a child’s 
early environment, during 
periods of developmental 
plasticity — including in utero 
— are major risk determinants 
for non-communicable disease 
in later life (see, for example, 
R. Barouki et al. Environ. Health 
11, 42; 2012). 

We should heed 
recommendations, based on 
findings in the field known 
as developmental origins 
of health and disease, that 
preventive strategies during 
fetal development and early 
childhood will ultimately be 
more effective in reducing the 
long-term burden of disease 
(see, for example, S. C. Davies 
et al. Lancet 382, 1383-1384; 
2013; and M. W. Gillman and 
D. S. Ludwig N. Engl. J. Med. 369, 
2173-2175; 2013). 

Jeffrey M. Craig Murdoch 
Children’s Research Institute, 
The Royal Children’s Hospital, 
Parkville, Australia. 
jeff.craig@mcri.edu.au 

Susan Prescott School of 
Paediatrics and Child Health, 
University of Western Australia, 
Perth, Australia. 


More inclusivity in 
PhD admissions 


More PhD programmes in 
science and engineering 
should adopt the mentoring 
model developed by Casey 
Miller and Keivan Stassun to 
boost numbers of female and 
minority students (see Nature 
510, 303-304; 2014). They 
should also use more-inclusive 
admissions criteria. 

Efforts to democratize and 
diversify US higher education are 
placing a growing emphasis on 
applicants’ social and economic 
origins (see, for example, 
W. G. Bowen et al. Equity and 
Excellence in American Higher 
Education Univ. Virginia Press, 
2005). I suggest that admissions 
measures be expanded to 
include applicants who are first- 
generation college students, for 
example, or who were raised in 
poverty or in deprived single- 
parent families. 

This broadened approach 
would fit with Miller and 
Stassun’s recommendation 
that admissions practices 
be augmented with “proven 
markers of achievement, such as 
grit and diligence”. 

Kenneth Oldfield University 
of Illinois at Springfield, Illinois, 
USA. 

koldf1@uis.edu 


CORRECTION 

In the Outlook article ‘Funding 
by numbers’ (Nature 511, 
S52-S53; 2014), the vertical 
axis of the graphic ‘Scholarly 
spending’ was wrongly 
labelled. It should have read 
‘Percentage of GDP spent on 
research and development’. 


BRIEF COMMUNICATIONS ARISING 


Bias towards large genes in autism 


ARISING FROM I. F. King et al. Nature 501, 58-62 (2013); doi:10.1038/nature12504 


In an important recent paper, King et al. reported that inhibition of 
TOP1 and other topoisomerases reduces the expression of extremely 
long genes. They also showed that the list of large genes affected by TOP1 
inhibition is enriched with candidate genes for autism spectrum disorders 
(ASD); however, the list of candidate genes that was used contains 
many genes with limited evidence for association with ASD. Here we 
demonstrate that the size of the genes among ASD candidate genes is 
biased towards extremely large genes only for genes identified to be 
disrupted by copy number variations (CNVs). Thus, our analysis suggests 
that the association between large genes and ASD is mainly driven by 
the method that implicated the genes in ASD. There is a Reply to this 
Brief Communication Arising by Zylka, M. J. et al. Nature 512, 
http://dx.doi.org/10.1038/nature13584 (2014). 

The literature on ASD mentions many candidate genes, yet convincing 
evidence has been obtained for only a few of them. This is reflected in the 
SFARI Gene database used by King et al. This database currently contains 
588 genes, for which the evidence for association with ASD varies 
considerably. To address this concern, a gene scoring module was developed 
in SFARI Gene 2.0 to estimate the evidence level for individual 
genes. For example, Table 1 of King et al. lists 49 ASD candidate genes 
that were affected by TOP1 inhibitors, but only three of these genes are 
considered strong candidates, four genes have suggestive evidence, and 
an additional two genes are involved in syndromes associated with ASD. 
To retest the association between gene length and ASD, we first selected 
genes in the SFARI database that had a score of at least suggestive evidence. 
The gene with the strongest evidence, CHD8, is not particularly 
large (~50 kilobases), but the list also contains some of the largest genes 
in the genome like AUTS2 and NRXN1 (~1,000 kb). The SFARI Gene 
database is based on studies that focused on specific genes and on genome- 
wide studies, which are considered to be unbiased. However, as we show 
below, even in genome-wide studies there are biases influenced by the 
type of study. 

The genome-wide search for ASD risk genes has been performed 
mainly by searching for rare and de novo variants, using two main meth- 
ods: microarrays to identify CNVs, and exome sequencing to identify 
single nucleotide variations (SNVs) predicted to alter the protein sequence. 
Under the assumption of uniform mutation rate, the probability for a 
coding SNV is not strongly related to the total size of the gene, which is 
mainly determined by the length of the introns and untranslated regions. 
However, this is not true for CNVs. Most CNVs identified in ASD are 
large and contain multiple genes. Therefore, it is hard to associate any 
particular gene with ASD. In contrast, when a gene is extremely large 
there is a higher probability for it to become an ASD risk gene because it 
was the only gene affected by the CNV. Accordingly, we hypothesized 
that large genes will be associated with ASD mainly if implicated by CNVs. 

Following the above hypothesis, we divided the ASD genes from the 
SFARI Gene database into two groups on the basis of the mutation type 
that implicated the gene in ASD: genes affected by CNVs or by SNVs. 
We plotted the distribution of gene sizes separately for each group. As 
can be seen in Fig. 1a, the distributions were notably different. “Genes with 
CNVs” showed bimodal distribution of short and large genes, whereas 
“genes with SNVs” were relatively short. 

Figure 1 | Association between gene size and mutation types. a, Density 
plots of gene length for ASD genes with evidence for association according to 
the SFARI Gene database. Genes were divided into two groups on the basis of the 
type of mutations that implicated the gene in ASD, CNVs or SNVs. b, The 
distribution of gene length is presented by box plots for all genes in the genome, 
brain-specific genes and genes with de novo SNVs or CNVs identified by recent 
genome-wide studies. NS, non-significant; **P < 0.01; ***P < 0.001. 

To further study the association between gene size and mutation types, 
we focused on six studies that searched for de novo CNVs and SNVs, 
mostly in the same cohort (the Simons Simplex Collection). We compared 
the distribution of gene lengths between all coding genes in the 
genome, brain-specific genes, and genes with de novo SNVs or CNVs, in 
both ASD probands and unaffected siblings (Fig. 1b). Consistent with 
King et al., and as was previously reported, we found that brain-specific 
genes are significantly larger than the average gene in the genome (P = 
6 X 10 ~”). Whereas genes identified to be disrupted by de novo SNVs 
in ASD had a similar distribution of sizes to brain-specific genes (P = 1), 
the sizes of genes with de novo CNVs were significantly larger than either 
group (P<3 x 10 7,P<5X10, respectively). Furthermore, there 
was no difference in gene size between affected versus unaffected chil- 
dren for both de novo CNVs and SNVs (P = 1). 

In summary, our analysis suggests that the association between large 
genes and ASD that was observed by King et al.’ stems mainly from the 
method of implicating genes based on CNVs, and is not an inherent prop- 
erty of ASD risk genes. 


Methods 


The list of ASD genes was constructed based on the SFARI Gene database (accessed 
11 December 2013). We discarded genes with no or minimal support for association 
(score >3). Because of the difficulty of replicating genetic association and linkage 
results for ASD, candidate genes were considered only if the evidence was based on 
rare variants. The length of each gene was determined based on the largest transcript 
in the refSeq table of the UCSC Genome Browser (hg19 assembly). Genes were divided 
into CNVs or SNVs groups on the basis of the majority of studies that associated 
the gene with ASD. In addition, we studied the length of genes found to be disrupted 
by de novo SNVs (nonsense, frameshift or splice site mutations), and single genes 
affected by de novo CNVs. To test for differences in the distribution of gene sizes 
in different groups we performed a Kruskal-Wallis rank sum test on all groups (using 
the kruskal.test function in The R Project for Statistical Computing), followed by 
a pair-wise Mann-Whitney-Wilcoxon test (using the wilcox.test function in R). 
P values were adjusted for multiple tests using the Bonferroni correction. 
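
The pairwise step of the testing procedure described above can be sketched in Python. This is an illustration only, not the authors' R code: it skips the omnibus Kruskal-Wallis test, uses a tie-corrected normal approximation without continuity correction (so values will differ slightly from R's wilcox.test), and the gene lengths and group names are hypothetical numbers invented for the example.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def pairwise_bonferroni(groups):
    """Bonferroni-adjusted p-values for every pair of named groups."""
    pairs = list(combinations(groups, 2))
    m = len(pairs)
    return {
        (a, b): min(1.0, mann_whitney_p(groups[a], groups[b]) * m)
        for a, b in pairs
    }


# Hypothetical gene lengths in kilobases (not data from the study).
gene_lengths_kb = {
    "all": [10, 25, 30, 45, 60, 80, 20, 35],
    "snv": [15, 28, 40, 55, 70, 22, 33, 90],
    "cnv": [400, 650, 900, 1200, 300, 750, 500, 1000],
}
adjusted = pairwise_bonferroni(gene_lengths_kb)
for pair, p in adjusted.items():
    print(pair, f"{p:.4f}")
```

With three groups there are three comparisons, so each raw p-value is multiplied by three and capped at 1, which is why a Bonferroni-adjusted value of exactly 1 can appear for groups with similar distributions.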
REPLYING TO S. Shohat & S. Shifman Nature 512, http://dx.doi.org/10.1038/nature13583 (2014) 


Shohat and Shifman’s analysis indicates that long autism spectrum 
disorder (ASD) genes are overrepresented in the SFARI Gene/AutDB 
database (as of 11 December 2013) owing to the discovery method. We 
agree with their analysis and with the need to consider the strength of 
evidence behind each candidate gene. When our study was underway, 
SFARI Gene provided the only comprehensive list of autism candidate 
genes with confidence values. Subsequent to our publication, more genes 
have been added to this database and scored, highlighting the rapid pace 
of advances in the ASD field and the changing confidence behind each 
ASD gene. 

We agree with Shohat and Shifman that the proportion of ASD genes 
that are long may drop as more ASD genes are identified. We did not 
account for how the discovery method used to identify a given ASD 
candidate, be it based on copy number variant (CNV) or single nucleotide 
variant (SNV), might affect average gene length in our study. However, 
it is also undeniable that many long genes are considered candidates in 
ASD pathology, such as NRXN1 and CNTNAP2. Moreover, our mechanistic 
findings are not in dispute. Indeed, three other groups came to 
the same conclusion as us: that topoisomerases preferentially facilitate 
expression of long genes. Our study demonstrates an essential role for 
topoisomerases in transcriptional elongation of long neuronal genes and 
suggests a critical role for these enzymes in neurodevelopmental 
disorders like autism. 

Shohat and Shifman also suggest that the SFARI Gene database contains 
many genes with weak links to ASD pathology. In our study, we did 
not rank genes as stronger or weaker ASD candidates, and treated all 
equally. However, when the degree of evidence behind each candidate 
is taken into account, using the gene scoring module in SFARI Gene 2.0 
(as of 1 April 2014), it remains clear that numerous strong ASD 
candidate genes are very long (>200 kilobases). Thus, we feel our conclusion 
linking topoisomerases and gene length with autism is still warranted, 
but this remains to be tested more rigorously pending in vivo studies 
with animal models and additional human genetic studies. 

Future studies are likely to validate additional long genes as strong 
ASD candidates. For example, NRXN3 and CNTN5 (1.5 and 1.3 megabases, 
respectively) are not yet scored in SFARI Gene, yet these genes are deleted 
in patients with ASD and the expression of both is significantly reduced in 
topotecan-treated neurons. 

Ultimately, we agree that making conclusions about the nature of 
ASD genes is complicated by factors like the ones Shohat and Shifman 
describe as well as by the evolving knowledge of autism genetics. 
Regardless, we identified a transcriptional mechanism that affects the 
expression of long genes, a number of which are currently classified as strong 
ASD candidates. 


NEWS & VIEWS 


Seeds of selective nanotube growth 


‘Seed’ molecules have been made that enable synthesis of just one kind of single-walled carbon nanotube, rather than a 
mixture of species. This paves the way for the preparation of pure samples of any nanotube species. SEE LETTER P.61 


JAMES M. TOUR 


The wonderful thing about single-walled 
carbon nanotubes (SWCNTs) is that 
more than 100 species can be produced 
by various growth methods. But this 
is also the most frustrating thing about them. 
It is expected that different nanotube species 
will have different applications, but the nanotubes 
generally form as mixtures of around 
5-50 species from any preparation method. 
Moreover, separation methods are cumbersome 
because of the many species that can 
form. After many attempts over the past two 
decades to grow single species of SWCNTs, 
Sanchez-Valencia et al. present a route to 
success on page 61 of this issue. 

Each SWCNT species can be defined by a 
pair of integers (n,m) called chirality indices, 
which describe how a graphene sheet (a single 
layer of carbon atoms in graphite) would 
hypothetically be rolled up to generate a tubular 
structure. Chirality indices can be used 
to determine the two unique, fundamental 
parameters of each rolled-up graphene struc- 
ture — the tube diameter and the angle with 
respect to a plane perpendicular to the tube’s 
long axis at which the graphene would be rolled 
into a tube. The term ‘chiral’ is sometimes a 
misnomer, because chirality is a property associated 
with asymmetry, but some SWCNTs are 
not asymmetric. 

Although there are many species of SWCNT 
(Fig. 1), there are only two main types: metal- 
lic nanotubes, which conduct electricity in the 
same way as gold or aluminium; and semi- 
conducting nanotubes, whose electrical con- 
ductivities are tunable, as in the semiconductors 
silicon and gallium arsenide. Conductivities are 
determined by a property called the bandgap 
— the smaller the bandgap, the larger the room- 
temperature conductivity. Metallic SWCNTs 
have a bandgap of 0 electronvolts (eV), whereas 
semiconducting nanotubes have a bandgap that 
can vary from approximately 1 meV to 1.5 eV 
(ref. 5). Specific bandgaps are required for particular 
applications. For example, a bandgap of 
0 eV is desirable for electrical wire and cable 
applications, whereas a larger bandgap is preferred 
for transistors. For photonic applications, 
different bandgaps are required to generate or 
detect different colours. 
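
How a pair of chirality indices maps onto tube geometry and electrical type can be sketched in a few lines of Python. This is a minimal illustration using standard textbook relations that the article itself does not spell out: a graphene lattice constant of about 0.246 nm, the usual chiral-angle convention (0° for zigzag, 30° for armchair), and the simple rule of thumb that a tube is metallic or nearly so when n − m is divisible by 3. The function names are made up for the example.

```python
import math

A_LATTICE_NM = 0.246  # graphene lattice constant in nanometres (textbook value)


def diameter_nm(n, m):
    """Tube diameter from the length of the roll-up vector for indices (n, m)."""
    return A_LATTICE_NM * math.sqrt(n * n + n * m + m * m) / math.pi


def chiral_angle_deg(n, m):
    """Chiral angle in degrees: 0 for zigzag (n, 0), 30 for armchair (n, n)."""
    return math.degrees(math.atan2(math.sqrt(3) * m, 2 * n + m))


def tube_class(n, m):
    """Structural class named in the figure: armchair, zigzag or chiral."""
    if n == m:
        return "armchair"
    if m == 0:
        return "zigzag"
    return "chiral"


def is_metallic(n, m):
    """Rule of thumb: (n - m) divisible by 3 gives a zero or very small bandgap.

    Only armchair tubes are strictly gapless; curvature opens a tiny gap
    in the other tubes that satisfy this rule.
    """
    return (n - m) % 3 == 0


for indices in [(6, 6), (12, 6), (6, 5), (10, 0)]:
    n, m = indices
    print(
        indices,
        tube_class(n, m),
        f"d = {diameter_nm(n, m):.3f} nm",
        f"angle = {chiral_angle_deg(n, m):.1f} deg",
        "metallic" if is_metallic(n, m) else "semiconducting",
    )
```

Under these assumptions the (6,6) armchair tube comes out at a diameter of roughly 0.81 nm with a 30° chiral angle.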
Figure 1 | Structural diversity of single-walled 
carbon nanotubes (SWCNTs). The orientations 
of the hexagonal rings of atoms in SWCNTs form 
the basis of three classes of nanotube, examples 
of which are shown here. Sanchez-Valencia et al. 
report a method for making an ‘armchair’ variety 
of SWCNT known as a (6,6) tube, starting from 
molecular ‘seeds’ on a catalytic platinum surface. 
The tubes form as a single product without 
contamination from zigzag or chiral nanotubes. 

Sanchez-Valencia and co-workers prepared 
exclusively (6,6) species of tubes 
starting from predefined ‘seeds’ — organic 
molecules prepared by multi-step synthe- 
sis. They grew SWCNTs from each of these 
seeds on a platinum surface at 500°C, using 
ethanol as a source of carbon atoms. The idea 
of using molecules to control the chirality of 
nanotubes is not new, but the authors have 
taken the concept of seed design for specific 
nanotube growth to an extraordinary level: the 
precise arrangement of the atoms in the seed 
predefines the species of tube grown. Their 
work suggests that it should be possible to 
design and synthesize seeds for any desired 
SWCNT species. 

Impressively, the researchers used scanning 
tunnelling microscopy to image the orienta- 
tion of the seeds on the platinum surface and 
to take snapshots during the main phases 
of the nanotube-forming process — that is, 
the formation of a bowl-shaped cap from 
the seed and the subsequent ‘base-growth’ 
stages (in which the catalytic platinum atoms 
stay at the substrate surface and the top of 
the nanotube is catalyst-free). They also 
studied their nanotubes using Raman spec- 
troscopy, observing a single peak in a band 
of the Raman spectrum that is diagnostic of 
the species of nanotube being analysed. This 
provided beautifully simple confirmation that 
only one species of nanotube forms from the 
seeds, and unambiguously pinpointed the 
nanotube structure. Furthermore, the authors 
performed extensive computational model- 
ling to understand the different phases of the 
nanotube-forming process. 

Sanchez-Valencia and co-workers’ method 
is currently the only one that allows predictable 
control of the chirality of SWCNTs. In another 
approach reported this year, (12,6) SWCNTs 
were prepared as 92% of the nanotube mixture 
using a solid alloy catalyst, but the tube 
species that grew could not be predetermined. 
The relatively low temperature (500°C) used in 
Sanchez-Valencia and colleagues’ procedure 
probably helps to maintain species specificity 
because, at higher temperatures, small variations 
in the growth temperature can cause 
changes in the chirality index along a tube. 

Some might view the need for a 10-step 
organic synthesis of the seeds as an 
overburdening limitation of the new approach. 
It is not. Consider that 1 mole of seeds is 
6 × 10²³ molecules, equating to 1.2 kilograms of 
material, a quantity that could easily be prepared 
by a chemical company. If, as Sanchez-Valencia 
et al. show, 50% of those seeds adopt 
the necessary conformation for growth at the 
platinum surface, then more than 5 tonnes 
of 10-micrometre-long SWCNTs could be 
obtained from just 1 mole of seeds. 
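
The arithmetic behind the figure of more than 5 tonnes can be checked with a short Python sketch. It assumes, beyond what the article states, that an armchair (n,n) tube contains 4n carbon atoms per 0.246-nm repeat along its axis (a standard structural result), that each growing seed yields exactly one tube, and that the mass of the seed itself is negligible; the function names are invented for the example.

```python
AVOGADRO = 6.022e23          # molecules per mole
CARBON_MOLAR_MASS_G = 12.011  # grams per mole of carbon atoms
A_LATTICE_NM = 0.246          # axial repeat length of an armchair tube, nm


def armchair_tube_mass_g(n, length_nm):
    """Mass in grams of one (n, n) tube: 4n carbon atoms per axial repeat."""
    atoms = 4 * n * length_nm / A_LATTICE_NM
    return atoms * CARBON_MOLAR_MASS_G / AVOGADRO


def yield_tonnes(moles_of_seeds, fraction_growing, n, length_um):
    """Total nanotube mass in tonnes if each growing seed gives one tube."""
    tubes = moles_of_seeds * fraction_growing * AVOGADRO
    grams = tubes * armchair_tube_mass_g(n, length_um * 1000.0)
    return grams / 1e6  # 1 tonne = 1e6 grams


# 1 mole of seeds, 50% of which grow into 10-micrometre-long (6,6) tubes.
total = yield_tonnes(1.0, 0.5, 6, 10.0)
print(f"{total:.2f} tonnes")
```

Each 10-micrometre (6,6) tube holds close to a million carbon atoms, so half a mole of tubes weighs a little under 6 tonnes, in line with the estimate quoted above.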

However, further challenges remain. The 
new method produces nanotubes that stand 
perpendicular to the growth surface, like the 
bristles of a carpet. This minimizes entangle- 
ment of the nanotubes, but they will still form 
bundles when they reach a certain length. Many 
applications need SWCNTs to be unbundled, 
and so the nanotubes will need to be subse- 
quently treated with solvent or wrapped with 
polymers. Furthermore, the surface area 
covered by SWCNTs using typical growth 
methods is of the order of 1% (ref. 10); using 
Sanchez-Valencia and colleagues’ approach, 
about 30 km² of platinum would be needed 
to accommodate 1 kg of seeds at this surface 
density, assuming that half of them grow. Plac- 
ing nanotubes in arrays precisely where they 
are needed has also been a persistent obstacle 
to the development of many devices. Lastly, it 
remains to be seen whether molecular seeds 
can be made that selectively control the forma- 
tion of other nanotube chiralities. 

Sanchez-Valencia and colleagues’ work 
represents a stellar breakthrough in the syn- 
thesis of SWCNTs. To those who have worked 
in this field for the past two decades, it is hum- 
bling to think that the selective growth of these 
diminutive structures has taken so long. But it 
is comforting to see it done so definitively. 


James M. Tour is in the Departments of 
Chemistry and of Materials Science and 


Directions for the drivers 


A comparison of colorectal cancer and normal cells from 103 patients identifies 
dozens of genes that are differently expressed in tumour cells as a result of altered 
regulation of transcription. SEE LETTER P.87 


GREG GIBSON 


How important for cancer incidence 
and progression is genetic variation 
that affects gene expression? This 
fundamental question has received remarkably 
little attention in recent studies of cancer 
genomes, perhaps because of a prevailing 
view that the cancer-causing mutations that 
can be targeted by drugs are those that disrupt 
protein structure. On page 87 of this issue, 
Ongen et al. demonstrate how simultaneous 
gene-expression profiling and whole-genome 
genotyping can be used to dissect the 
regulation of gene transcription in colorectal 
cancer. The findings provide two thought- 
provoking insights: that cancer-driving 
changes may be identifiable among an excess 
of regulatory mutations, and that ‘cryptic’ 
regulatory genetic variation has the potential 
to modify cancer progression. 

It is well established that gene expression is 
altered in cancer. Despite their independent 
derivation, tumours of the same type tend to 
converge on a common, new gene-expression 
profile. Various studies, primarily from The 
Cancer Genome Atlas project, have noted 
differential transcription of tumour-driving 
and tumour-suppressing genes in advanced 
tumours, but so many gene transcripts are 
altered in these tumours that it is difficult to 
know which ones ‘drive’ the altered behaviour 
and which ones are ‘passengers’, just 
going along for the ride. Furthermore, epi- 
genetic alterations — those that modify 
gene expression without involving sequence 
mutations — have been implicated in cancer, 
including colorectal cancer. Broad surveys 
of transcriptional and epigenetic changes in 
tumours have been conducted, but not on 
the scale and resolution achieved by Ongen 
and colleagues. They used a method known 
as RNA-Seq, in which the transcriptome of a 
cell (its whole complement of RNA molecules) 
is sequenced. 

Over the past couple of years, sequencing 
of the exome of cancer cells (in essence, just 
the protein-coding regions) has suggested 
that around 250 genes are mutated in cancer 
cells significantly more often than expected 
by chance. Many of these are pan-cancer 
genes, and some are tumour-type specific. It is 
less straightforward to perform similar analy- 
ses for regulatory DNA, for two reasons: we 
are just beginning to learn how to identify reg- 
ulatory functions in the hundreds of kilobases 
that surround genes, and altered gene expres- 
sion is often due to changes in genes located 
elsewhere in the genome. Ongen et al. overcome 
these limitations by focusing on changes 
in the ratio of expression of heterozygous 
alleles (sites at which the DNA sequence differs 
between the two copies of the sequence in 
a cell) between tumour and matched normal 
cells, as had also been done in another recent 
analysis of colorectal cancer. They call the 
hundreds of instances of this phenomenon 
that they find per sample ‘genes with allelic 
dysregulation’ (GADs). 

Figure 1 | Cryptic regulation in tumour cells. a, In normal cells, the two copies (alleles) of a gene in 
which the alleles are heterozygous, meaning that they contain a different nucleotide at a specific site 
(here, A and C), are both expressed at the same, low level. In this example, the alleles in the transcript are 
associated with a normally irrelevant, or ‘cryptic’, variant allele in a regulatory region of the gene (A with 
T and C with G). b, In a tumour cell that is characterized by enhanced expression of a transcription factor, 
this cryptic variant becomes relevant, because the transcription factor can bind to the T-containing allele 
but not to one containing G. This leads to enhanced transcription of the gene containing the A allele 
relative to that containing C, and thus an altered allelic-expression ratio. 

Although allele-specific expression can 
also be attributed to changes at other genes, 
it is highly likely that in many cases it is due 
to a locally acting regulatory mutation. Ongen 
et al. observe a significant correlation between 
the somatic (non-germline) mutation rate 
and altered allele-specific expression and, 
in each of the 103 matched normal-tumour 
pairs they analysed, approximately 200 tran- 
scripts showed a cancer-specific deviation in 
the allelic expression ratio at heterozygous 
sites. Their interpretation is that one allele is 
transcribed more than the other owing to the 
action of regulatory-sequence variation. 
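
The idea of a cancer-specific deviation in the allelic expression ratio can be made concrete with a toy Python calculation. This is a minimal sketch and not the statistical procedure used by Ongen et al.: the read counts are hypothetical, the ratio is simply the share of RNA-Seq reads carrying one allele at a heterozygous site, and no significance test is applied.

```python
def allelic_ratio(allele_a_reads, allele_b_reads):
    """Fraction of reads at a heterozygous site that carry allele A."""
    total = allele_a_reads + allele_b_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return allele_a_reads / total


def allelic_shift(normal_counts, tumour_counts):
    """Change in allele-A fraction from matched normal tissue to tumour."""
    return allelic_ratio(*tumour_counts) - allelic_ratio(*normal_counts)


# Hypothetical (allele A, allele C) read counts at one heterozygous site.
normal = (48, 52)    # roughly balanced expression in normal cells
tumour = (150, 50)   # allele A favoured in the matched tumour

print(f"normal ratio: {allelic_ratio(*normal):.2f}")
print(f"tumour ratio: {allelic_ratio(*tumour):.2f}")
print(f"shift:        {allelic_shift(normal, tumour):+.2f}")
```

A real analysis would also need to account for read depth, mapping bias and multiple testing before calling a site dysregulated.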

The authors show that some of this deviation 
can be attributed to familiar cancer-associated 
mutation types, including loss of heterozygosity 
and copy-number alteration, and that some is 
due to inferred (yet to be defined) regulatory 
mutations. Tallying these instances over all 
of the samples, and taking two approaches to 
controlling for statistical biases, they arrive at 
a list of 71 GADs that occur more frequently 
in tumours than in normal cells, 9 of which 
are shared with an existing list of pan-cancer 
driver mutations. These observations provide 
a smoking gun for the idea that regulatory 
mutations can drive cancer. Perhaps there is no 
need to distinguish them with their own name, 
but the term ‘GPS mutations’ comes to mind, 
because they are instructing driver mutations 
on what to do, but it is not altogether clear that 
the cancer cells would not still attain a tumor- 
ous state without their help — much like a 
satellite-based navigation system instructing a 
driver on how to get somewhere. 

A related term, ‘back-seat driver’, has been 
invoked to describe another class of mutation 
that probably has a role in mediating cancer 
progression or metastatic spread, and that 
is conditional on the status of other driver 
mutations. Ongen and colleagues’ second 
major contribution is to suggest that, in addition 
to GPS mutations in GADs, another 
important source of cancer regulation is cryptic 
genetic variation (Fig. 1). These are genetic 
variants that are not relevant under normal 
circumstances, but become so only in a perturbed 
state. They may play a key part in modifying 
the expression of cancer-driving genes. Spe- 
cifically, when the authors looked for common 
variants that associate with differences in gene 
expression among individuals, they found that 
at least one-third of the expression-regulating 
variants (eSNPs) are tumour specific. 

Furthermore, these polymorphisms are 
enriched for binding sites for six known can- 
cer-related transcription factors, all of which 
are upregulated in colorectal cancer. The 
idea is that when one of these factors (IRX3, 
E2F4, NFIL3, TFAP2A, CUX1 or LEF1) is in 
excess, polymorphisms that in normal cells 
do not influence the expression of the adja- 
cent gene become relevant. Whether or not 
these are key back-seat modifiers of cancer 
progression is unclear, because there is only 
a mild enrichment for these polymorphisms 
in a genome-wide association study (GWAS) 
of colorectal cancers. Perhaps they would 
be more enriched in a GWAS that assessed 
tumour progression. 

Two words of caution about this study are 
in order. The first is that there has been no 
attempt to demonstrate functionality of any 
of the candidate mutations — the findings are 
all based on statistical association. The second 
is that anyone using RNA-Seq quickly real- 
izes that there are many points at which the 
analysis can provide divergent results. Given 
recent interest in the repeatability of findings, 
it could be argued that it would be a good idea 
for journals to require independent parallel 
analyses from a different group, conducted 
blind, to corroborate such results before 
publication. This suggestion is not made to 
denigrate the careful and insightful analyses 
reported by Ongen et al., but is rather a generic 
comment on the inherent complexity of RNA- 
Seq, GWASs and enrichment analyses. Differ- 
ent analysts are likely to find quite different 
details. Yet the prospect that acquired variants 
drive cancer by controlling gene expression 
against a background of cryptic regulatory 
modifiers opens up a new perspective on 
cancer research. Similar analyses can now be 
performed on other data sets, and also on diseases 
other than cancer in which regulation of 
gene expression is altered. The next challenge 
is to establish the clinical utility of the 
identified regulatory variation. 


Alzheimer’s disease under strain 


Two studies of amyloid-β protein aggregates, which cause Alzheimer’s disease, 
find that different conformations of the aggregates can define different strains of 
the disorder, drawing parallels with prion diseases. 


ADRIANO AGUZZI 


A crucial event in the initiation of 
Alzheimer’s disease is the formation 
of amyloid-β protein aggregates in 
the brain. Several studies indicate that these 
aggregates can acquire distinct shapes. Now, 
two papers published in Proceedings of the 
National Academy of Sciences demonstrate 
that each amyloid-β protein of a given shape 
can seed more aggregates of that same shape. 
This suggests that there are several variants 
of Alzheimer’s disease, similar to influenza 
strains causing clinically distinct epidemics. 
The prion, which causes spongiform 
encephalopathies (neurodegenerative diseases 
such as Creutzfeldt–Jakob disease in humans 
and scrapie in sheep), remained shrouded 
in mystery for decades. But over the past 
20 years, the ‘protein-only’ prion hypothesis 
has become widely accepted. This states that 
prions arise from an abnormal version of the 
cell-membrane protein PrP, called PrPSc, which 
forms misfolded protein aggregates, or plaques. 

Surprisingly, given the protein-only 
hypothesis, prions isolated from different 
sources (such as sheep, deer and cattle) target 
different brain areas, induce unique symptoms 
and have varied incubation times, indicating 
that there are different strains. More disconcertingly, 
strain-specific properties are maintained 
even when prions from different sources 
are transmitted through several generations 
of inbred mice with identical genes encoding 
PrP (ref. 6; Fig. 1a). In typical viral and 
bacterial infections, genetic variations 
control strain-specific properties, but 
this mouse experiment suggests that the 
heritable traits of prions must somehow 
be encoded within the protein, most 
probably within its shape. Much evidence 
has accrued in favour of this theory, 
with several papers indicating that 
PrPSc can adopt different conformations. 


Although the detailed structure of 
PrPSc remains unresolved, the protein 
seems to be very similar to amyloids, a 
family of filamentous structures that 
form protein aggregates, as PrPSc does. 
Amyloid aggregates are implicated in 
many neurodegenerative diseases — 
including Alzheimer’s disease and Par- 
kinson’s disease — and are found in a 
growing list of body sites in addition to 
the brain. Therefore, understanding the 
details of prion and amyloid assembly 
and replication is of broad relevance. 

If prions can self-assemble in different 
conformations, and amyloid-β shares 
properties with PrPSc, might amyloid-β 
also arise in different strains? To investigate 
this, Watts et al. isolated amyloid-β 
from two patients who had died from the 
sporadic form of Alzheimer’s disease, and 
from two patients with hereditary Arctic 
or Swedish forms of Alzheimer’s disease. 
The mutation that causes Arctic Alzhei- 
mer’s disease changes the DNA sequence 
of the gene encoding amyloid-β, whereas 
the Swedish mutation occurs outside this 
genetic region and augments the cleavage of 
amyloid-β from a larger precursor protein. 

Watts and colleagues inoculated mice 
engineered to overexpress amyloid-β with 
brain samples from the four patients and from 
one healthy human control. Confirming previous 
studies, they found that inoculation with 
amyloid-β-containing samples accelerated the 
formation of amyloid-β plaques in all mice. 
However, Swedish and sporadic amyloid-β differed 
from Arctic amyloid-β in many ways. In 
particular, the amyloid-β aggregates that formed 
in mice exposed to the Arctic inoculum were 
much less resistant to physical disassembly than 
were aggregates derived from any other inoculum. 
This may seem counterintuitive, because 
the Arctic mutation is highly aggressive, but it 
conforms with the finding that more-fragile 
fibrils have greater seeding potential and, by 
inference, are more aggressive. 

Crucially, the authors found that the Arctic 
inoculum maintained its strain-specific prop- 
erties (‘strainness’) even when they recovered 
samples from inoculated mice and transmit- 
ted them to a second generation of genetically 
identical mice (Fig. 1b). The significance of 
this result may be clear to virologists, but such 
experiments have rarely, if ever, been performed 
in the context of non-PrPSc amyloids. Because 
the amyloid-β aggregates that are passed 
onto the second mouse are almost entirely of 
mouse origin, any retained strainness must be 
enciphered in the conformation of amyloid-β 
protein, rather than in its gene sequence. 

Figure 1 | Information encoded in protein aggregates. a, 
Prion diseases with differing characteristics, including bovine 
spongiform encephalopathy (BSE), sheep scrapie and chronic 
wasting disease, are caused by aggregation of the same protein, 
PrPSc. These distinct traits are maintained when disease samples 
are inoculated into the brains of mice and the accumulating 
aggregates are subsequently transferred through genetically 
identical mice. b, Similarly, Watts et al. report that the 
strain-specific traits of three forms of Alzheimer’s disease (Swedish 
and sporadic, which have similar traits, and Arctic, which is 
more aggressive) are maintained during inoculation and 
serial transmission of each strain through two generations 
of genetically identical mice that overexpress amyloid-β. 
Stöhr et al. produce similar results with two synthetic forms of 
amyloid-β (not shown). This suggests that strain-specific traits 
of prion diseases and Alzheimer’s disease are encoded by the 
conformations in which protein aggregates form. 

In an accompanying study, Stöhr et al. 
provide evidence that two synthetic variants 
of amyloid-β that are 40 or 42 amino-acid 
residues in length (Aβ40 and Aβ42) cause distinct 
disease characteristics when injected into 
mice overexpressing amyloid-β. This finding 
suggests that distinct amyloid-β strains can be 
generated from chemically defined, homogeneous 
proteins, or that Aβ40 and Aβ42 cause 
different disease characteristics because they 
have different gene sequences that dictate different 
kinetics of aggregate assembly. 

The current studies present convincing 
evidence for the existence of conformation-enciphered 
amyloid-β strains in patients with 
Alzheimer’s disease, but do not provide structural 
information about the protein conformations 
involved. Last year, conformations of 
amyloid-β in two patients with sporadic Alzheimer’s 
disease were studied at extremely 
high resolution. This experiment showed that 
the two patients harboured distinct conformations 
of amyloid-β aggregates, potentially 
representing distinct strains. 

Excitingly, the existence of amyloid-β strains 
might explain some of the current limitations 
of therapies for Alzheimer’s disease. 
Immunization with amyloid-β antibodies 
is extremely effective in mouse models 
of Alzheimer’s disease, but translation to 
humans has proved disappointing. Might 
distinct amyloid-β strains be better targeted 
with strain-specific antibodies? If 
so, precise strain testing in Alzheimer’s 
disease may lead to better patient stratification 
and higher response rates. 

These papers assert that amyloid-β 
aggregates are prions. However, key differences 
set PrP apart from amyloids. 
Prions are by definition transmissible, 
and have caused epidemics: bovine 
spongiform encephalopathy, for example, 
has killed almost 200,000 cattle, 
and its transmission to humans in the 
form of variant Creutzfeldt–Jakob disease 
has caused more than 200 deaths. 
By contrast, there is no evidence that 
Alzheimer’s disease is infectious among 
humans, and the transmissibility of 
amyloid-β is limited to mice that are 
genetically programmed to be aggregation-prone. 
Therefore, amyloid-β aggregates 
are distinct from prions, even if 
they exhibit strainness. These clinical 
and epidemiological differences are 
important for human medicine, and are 
evidence against lumping amyloidoses 
and prion diseases together. In fact, it 
might be more appropriate to classify 
amyloid-β aggregates as prionoids. 

Moreover, if we classify amyloid-β 
aggregates as prions, we need to treat 
them accordingly. PrPSc is subjected to mandatory 
confinement (biosafety level 2 or 3, 
depending on the prion strain), and in the 
United States, prions are considered to be 
bioterrorism agents. Expensive medical 
devices that have been used on people who 
are infected with prions must typically be 
destroyed to prevent accidental disease transmission. 
If amyloid-β aggregates are truly 
equivalent to prions, such draconian measures 
may be necessary, and inevitable. But it 
would seem prudent to accrue more evidence 
that amyloid-β is genuinely infective before 
implementing such measures. 
doomed star 


Some stars explode in thermonuclear supernovae, but understanding of why this 
occurs comes mainly from indirect clues. Now, the progenitor of a member of a 
strange class of such explosions may have been detected directly. SEE LETTER P.54 


STEPHEN JUSTHAM 


Explaining the nature of supernovae is one 
of the classical problems in astronomy. 
Supernovae are not only enticing mysteries 
in which explosions of awesome power 
and brilliance are perpetrated by well-hidden 
culprits, they are also of broad importance to 
astrophysics, so our lack of certainty about 
the progenitors of some supernovae is embarrassing. 
On page 54 of this issue, McCully et al. 
report that a combination of good fortune and 
careful analysis has pointed them to the probable 
system behind a particularly puzzling type 
of supernova. 

In ancient times, people interpreted new 
lights in the sky as heavenly signs of earthly 
fates. Now, astronomers use a form of super- 
nova known as type Ia to infer the history and 
future of the Universe. Those ancient new 
lights appeared to be new stars only because 
the stars that produced the celestial displays 
had previously been too faint to see. We now 
have a similar problem with supernovae. 
Although supernova explosions can be seen 
across more than half the age of the Universe, 
detecting pre-explosion stars is difficult even 
when they are massive and luminous. For 
such supernovae we still only have a small 
number of definitive pre-explosion detec- 
tions. The progenitors of type Ia supernovae 
have proved even more elusive. This type of 
supernova is thought to occur when a star 
called a white dwarf undergoes runaway 
nuclear fusion, but we have yet to directly 
see what circumstances cause this to happen. 
One such system may have been detected in 
X-rays, but perhaps the clearest trace of these 
progenitors has been in material lingering 
close to the explosion. However, theorists 
have been more successful in making multiple 
models consistent with the presence of that 
material than in convincing everyone to agree 
on one interpretation. 

We are therefore lucky that two supernovae 
separated by only a decade have been detected 
in the spiral galaxy NGC 1309 (Fig. 1). After 
the first explosion, detected in 2002, a team 
led by one of McCully’s co-authors used the 
Hubble Space Telescope to stare at the host 
galaxy. The astronomers did this so that they 
could compare the distance to NGC 1309 
inferred from two methods (one using the 
2002 supernova and the other involving a 
type of star for which the method has been 
well-calibrated with nearby examples) and 
improve the precision with which we measure 
even larger cosmological distances. So when 
the second explosion occurred, in 2012, there 
was a wealth of data to examine for signs of the 
supernova progenitor. 

Figure 1 | A double bill. The spiral galaxy NGC 1309, which is about 30 million parsecs from Earth, 
hosted a type Ia supernova in 2002, SN 2002fk, and then a type Iax explosion in 2012, SN 2012Z. McCully 
et al. have detected the probable progenitor of SN 2012Z in pre-explosion images obtained by the Hubble 
Space Telescope (see Fig. 1 of the paper). 

At first, the 2012 supernova was classified 
as a type Ia (ref. 7), although one with very 
unusual features. However, it and similar 
supernovae have become known as type Iax, 
following the name of a peculiar prototype 
event called SN 2002cx (refs 8,9). Astrono- 
mers sometimes worry too much about 
classification: dividing objects that show 
continuous variation in physical properties 
such as luminosity and velocity into discrete 
categories can become unhelpful. However, 
this new name reflects the realization that the 
2002cx-like explosions may be even less like 
normal type Ia supernovae than first thought. 
Superficially similar qualities might have pre- 
viously misled us into doing something akin 
to classifying a platypus as a type of duck. 

McCully and colleagues have found a 
source in the pre-explosion images from the 
Hubble Space Telescope whose location closely 
matches that of the 2012 supernova. They also 
show that this alignment has only a roughly 
1% probability of being a coincidence, which 
leads them to conclude that this source is 
very likely to have produced the supernova. 
Although apparently exciting coincidences can 
happen by chance, there have been very few 
pre-supernova observations that could have 
identified a progenitor that resembles this one. 
So this result is not due to astronomers tak- 
ing snapshot after snapshot and waiting for an 
interesting-looking but meaningless alignment 
to randomly occur. 

Assuming that the observed source is the 
progenitor, the first question in interpreting 
the results is whether it was emitting mainly 
simple starlight. Because we might also 
expect light to be generated by matter falling 
onto the white dwarf before it explodes, it is 
possible that McCully et al. have detected a 
signal from this process. However, the authors 
argue that it is more likely that the source they 
have identified is a star made of helium, which 
has been transferring matter onto an unseen 
white dwarf. This option is tempting because 
it matches some models for type Iax super- 
novae in which the white dwarf accretes a 
surface helium layer that eventually ignites in 
a thermonuclear runaway, in turn triggering 
fusion in the underlying star. 

If this explanation is true, then the source 
should still be present when the light from 
the supernova has faded sufficiently to allow 
the researchers to look for it again. However, 
the supernova might well have heated the 
helium star, stripped it of some surface layers 
or polluted it. It would be frustrating if the star 
appears exactly as it did before the explosion, 
and in that case we would have to worry more 
about whether the true progenitor was too 
faint to have been detected. It is to be hoped 
that future observations of this source will help 
us to understand not just type Iax events, but 
also the impact of supernovae on nearby stars. 
Another possibility noted by McCully and col- 
leagues is that this supernova was the death of a 
massive star, in which case the source will have 
disappeared. This conclusion would show that 
Iax events truly are produced by qualitatively 
different systems from type Ia supernovae. 

If McCully et al. have identified the pro- 
genitor, then the observation will be one of 
the most memorable signposts on the road 
to understanding supernovae. And because 
it may improve our knowledge of what can 
happen after a layer of helium ignites on the 
surface of another potentially explosive collection 
of fusion fuel, it could also help to explain 
the progenitors of other recently discovered 
events for which variations of this model have 
been proposed. The 2012 supernova in 
NGC 1309 has not yet yielded all its secrets, 
but this discovery might help to solve mysteries 
both old and new. 
Early treatment may not be early enough 


Giving monkeys antiretroviral therapy from just three days after exposure to simian 
immunodeficiency virus does not prevent a subsequent rebound of viral replication, 
suggesting that viral reservoirs are established early. SEE LETTER P.74 


KAI DENG & ROBERT F. SILICIANO 


Although antiretroviral therapy (ART) 
is successful in controlling HIV-1 
replication, the virus persists in a stable 
latent reservoir in infected cells that have 
entered a resting state. In these immune 
cells, called resting memory CD4+ T cells, the 
viral genome hides as pure genetic information 
integrated into the cells’ DNA (as proviral 
DNA), unaffected by ART or immune 
responses. But when the cells are subsequently 
activated, this viral reservoir again 
releases replication-competent HIV-1, and 
it is therefore considered the main barrier to 
curing HIV-1 infection. Despite vigorous 
efforts to understand the latent reservoir in 
the hope of finding ways to purge it, it has 
been unclear when it is seeded and whether 
early treatment can prevent this. On page 74 
of this issue, Whitney et al. provide evidence 
in the simian immunodeficiency virus (SIV) 
model of HIV-1 infection that the reservoir is 
seeded very early, during the first few days 
of infection. 

It had been assumed that the initial seeding of 
the latent reservoir occurs during acute HIV-1 
infection, at a point when viraemia (the presence 
of viruses in the blood) has risen to a 
high level. It was proposed that if ART is begun 
before peak viraemia occurs, this might prevent 
the reservoir from becoming established, or at 
least significantly reduce its size. Recent clinical 
studies confirmed that early ART can indeed 
reduce the size and dissemination of the viral 
reservoir, and even promote long-term control 
of the virus in some infected individuals. 
Interest in the possibility of viral-eradication 
strategies based on early ART initiation 
was further heightened by the case of the 
‘Mississippi baby’, a child who was born to 
an infected mother and who had around 
20,000 viral copies per millilitre of blood 
plasma at birth. The child was started on 
ART 30 hours after delivery and was treated 
for 18 months. The virus was undetectable 
by 29 days, and remained so for 27 months 
after treatment was stopped, when rebound 
viraemia was detected. Delayed and highly 
variable time to rebound is the predicted outcome 
of interventions that reduce the reservoir 
to a very low level. 

To evaluate the temporal dynamics of 
initial viral-reservoir seeding, Whitney and 
colleagues treated rhesus macaques starting 
on days 3, 7, 10 or 14 after SIV infection. 
Although initiating treatment on days 7, 10 
and 14 significantly reduced peak plasma 
virus levels, treatment from day 3 completely 
blocked the emergence of detectable viraemia; 
this was also evidenced by the absence of 
SIV-specific antibody-based and cellular 
immune responses in these animals. The 
authors found no proviral DNA in the animals’ 
peripheral-blood mononuclear cells (which 
include CD4+ T cells) before treatment initiation 
on day 3, but proviral DNA was already 
detectable in their lymph nodes and the 
mucosal linings of the gastrointestinal tract. 
This crucial finding suggests that the viral 
reservoir may be first seeded in the lymphoid 
and mucosal tissues, a result with important 
implications for HIV-1 eradication strategies. 

Figure 1 | SIV eradication strategies. a, Activation of naive CD4+ T cells renders the cells highly 
susceptible to infection with simian immunodeficiency virus (SIV), which becomes integrated into the 
host-cell genome to allow viral replication. Most CD4+ T cells die rapidly after infection, but a small 
fraction survives and reverts back to a resting memory state, in which SIV gene expression is turned off, 
resulting in a latent reservoir of the virus. Subsequent activation of these cells can restart virus production. 
b, Antiretroviral therapy (ART) soon after infection can stop more cells from becoming infected, but does 
not affect the fate of already infected cells, and some survive to seed the latent reservoir. Whitney et al. 
show that a viral reservoir is established within days of SIV infection. c, In vaccinated animals, 
SIV-specific cytotoxic T cells that are generated in response to the vaccine can kill infected cells 
before they revert back to the resting state, thereby preventing the establishment of a latent reservoir. 

Most significantly, the authors observed 
viral rebound in all animals after ART was 
stopped. This occurred even when ART that 
fully suppressed detectable viraemia was ini- 
tiated at day 3 and continued for 6 months, 
a treatment period that allows elimination 
of labile infected cells and thus reveals stable 
reservoirs. The observed rebound suggests 
that the viral reservoir is seeded surprisingly 
early in SIV-infected animals. However, the 
animals treated from day 3 showed a slightly 
delayed viral rebound compared with those 
that started ART at later times. Using a sophis- 
ticated model of viral dynamics, the authors 
show that the time to viral rebound is cor- 
related with total viraemia during the acute 
phase of infection. 

These data indicate that the viral reservoir 
could be seeded substantially earlier than pre- 
viously assumed — a sobering finding that 
poses additional hurdles to HIV-1 eradica- 
tion efforts. If this evidence from SIV-infected 
animals reflects what happens early in HIV-1 
infection in humans, it would mean that it is 
nearly impossible to initiate ART before viral 
reservoirs have seeded, because viraemia is not 
detectable at this point. In other words, reser- 
voir seeding precedes any clinical evidence of 
infection. However, although early treatment 
may not prevent reservoir seeding, it has been 
consistently shown to reduce the size of the 
latent reservoir, and infected individuals 
who receive early treatment could have a lower 
barrier to cure in future eradication strategies. 

Whitney and colleagues’ findings are of 
particular interest in light of a study last year 
reporting that a disseminated SIV infection 
could be cleared by vaccine-induced T-cell- 
based immune responses. The different out- 
comes of these two studies may be partly due to 
the fate of infected cells during acute infection. 
Early initiation of ART immediately stops sub- 
sequent new infection of susceptible cells, but 
does not affect the fate of cells that are already 
infected. A small fraction of these infected 
cells survive and revert back to a resting state, 
thereby seeding the latent reservoir (Fig. 1). By 
contrast, vaccinated animals have pre-existing 
SIV-specific cytotoxic T cells that can clear the 
infected cells before they go into latency, thus 
preventing the viral reservoir from becoming 
established. 

It remains to be seen whether clinical studies 
will confirm Whitney and colleagues’ obser- 
vations, because substantial differences exist 
between SIV infection in rhesus macaques and 
HIV-1 infection in humans. As mentioned by 
the authors, the SIV dose used in their study 
may be much higher than the typical amount 
of HIV-1 involved in sexual transmission, per- 
haps resulting in a higher level of early viral 
replication. Nevertheless, the striking findings 
of the early seeding of the viral reservoir in 
mucosal and lymphoid tissues before viraemia 
is detected suggest that new approaches, in 
addition to early treatment, will be necessary 
to eradicate HIV-1 infection. 
Tooth structure re-engineered 


Mice deficient in the EDA protein lack normal tooth features. Restoring EDA in 
embryonic teeth at increasing doses has now been found to recover these dental 
features in a stepwise pattern that mimics evolution. SEE ARTICLE P.44 


ZHE-XI LUO 


A fundamental connection between 
developmental changes and evolution 
has long been established. This link 
is gaining renewed emphasis as molecular 
studies shed new light on evolution by revealing 
many genetic modifications that alter 
developmental processes, in turn changing an 
organism’s shape and structure. On page 44 
of this issue, Harjunmaa et al. report that, by 
simply tinkering with the genes and signalling 
pathways that control the shape of develop- 
ing teeth, they have remade several different 
tooth structures in vitro. These structures draw 
a striking parallel with how teeth evolved from 
those of distant mammalian ancestors to the 
teeth of modern-day rodents. 

Some lineages of therians (marsupial 
and placental mammals and their kin) that 
lived in the Mesozoic era, 252 million to 
66 million years ago, had ‘tribosphenic’ 
molars. The taller front end of the tribosphenic 
lower molar, called the trigonid, had 
three cusps (the raised points on the crown 
of the tooth) for shearing food. The lower 
back end, the talonid, had a basin-like surface 
for grinding food. The trigonid and talonid 
of Mesozoic mammals are still recognizable in 
modern-day rodents, albeit in a highly modified 
form. As rodents arose from ancestral 
mammals and diversified into many lineages, 
cusps that were separate in the Palaeocene 
epoch 66 million to 56 million years ago 
became progressively connected by crests 
(which are more effective for chewing plant 
food) in a ‘cusp-to-crest’ dental evolution 
that occurred in many rodent groups. 

© 2014 Macmillan Publishers Limited. All rights reserved 

The ectodysplasin A (Eda) gene encodes a 
vertebrate signalling protein that is involved 
in the development of a wide array of structures, 
from hair to sweat glands. In the 
embryonic tooth, the EDA protein is active 
in enamel knots, which are signalling centres 
and the precursors of adult tooth structures. 
EDA regulates the position and size of future 
tooth cusps and their connecting crests. 
Mice that do not express Eda lack normal 
cusps and crests, and instead have only basic, 
generalized teeth. 
Figure 1 | Reconstructing tooth evolution in vitro. a, As mammals evolved, their molars (one indicated 
in the jaw) became ever more complex, because extra tooth features evolved over time. Structures called 
trigonids (dark grey) evolved first, in early mammals such as the Mesozoic symmetrodonts, followed by 
talonids (light grey) in a group of Mesozoic therians called cladotherians. Hypoconulids (blue) evolved 
in Mesozoic therians, and anteroconids (yellow) in advanced rodents. The anteroconids are similar in 
structure to pseudo-talonids, which evolved separately (convergently) in pseudo-tribosphenic teeth in an 
early-divergent clade of mammals (the pseudo-talonid is also indicated in yellow). b, Deletion of the 
Eda gene in mouse embryos results in the loss of all of these tooth features. Harjunmaa et al. show in vitro 
that addition of the EDA protein to embryonic teeth from Eda-deficient mice in increasing doses can 
replay the steps of evolution. Furthermore, features that evolved longer ago respond in a less variable 
manner than features that evolved more recently. 


Harjunmaa and colleagues grew Eda- 
deficient teeth in vitro, and found that cusps 
and crests could be restored by adding EDA. 
They demonstrated, with the aid of computer 
models, that different doses of EDA alter 
tooth morphogenesis (the process by which 
structures are shaped as they develop), akin 
to the dental transformations that occurred as 
rodents evolved from their Mesozoic mamma- 
lian ancestors (Fig. 1). For example, the trigo- 
nid — the first part of the tribosphenic molar 
to have evolved — is regenerated with only a 
small dose of EDA. However, a higher dose of 
EDA is required to restore the talonid, which 
evolved more recently. 

The cusp-to-crest morphogenesis of mouse 
molars is controlled by a gene network that 
includes genes encoding the signalling pro- 
teins fibroblast growth factor 3 (Fgf3; ref. 9) 
and sonic hedgehog (Shh). An increase 
or decrease in Fgf3 causes over- or 
under-development of tooth features, respectively. 
Harjunmaa and co-workers found that reduc- 
ing the concentration of SHH in Eda-deficient 
teeth regenerated the ancestral features of Pal- 
aeocene rodents, reversing the cusp-to-crest 
transformation of modern-day rodents. Thus, 
molecular manipulations that alter tooth 
morphogenesis in vitro can replay evolution, 
either forward, to mimic the fossil record, or 
in reverse. 

Perhaps the most exciting insight from 
Harjunmaa and colleagues’ work is that 
ancestral structures show a more uniform 
response to the addition or removal of EDA 
or SHH than those that evolved more recently 
or independently in different lineages (con- 
vergent evolution). For example, addition of 
a low dose of EDA reliably restored the trigo- 
nid, as expected of ancestral features, which 
are typically evolutionarily well conserved 
owing to their long history. By contrast, higher 
doses restored the talonid in many, but not all, 
teeth — development of this feature was more 
variable in response to EDA. This is consistent 
with the theory that the talonid basin evolved 
convergently in different mammal lineages, 
but has reduced in size in some carnivoran or 
insectivoran mammals. 

The hypoconulid is a talonid cusp in some 
mammals, but is enlarged and forms a sepa- 
rate lobe in mice. The authors found that 
full development of this structure requires a 
higher dose of EDA than does the rest of the 
talonid, and shows even wider variation in 
its response to EDA. Finally, the anteroconid 
in mice — another feature that arose late in 
rodent evolution — requires the highest EDA 
dose to regenerate, and shows the broadest 
variation when regenerated. Its position on 
the tooth corresponds to the ‘pseudo-talonid’ 
that arose in some early-divergent mammals 
that died out before the end of the Mesozoic. 
Harjunmaa and co-workers’ experiment 
therefore demonstrates that modern-day 
mice still have the developmental potential 
to replicate evolutionary events that occurred 
long ago, in the now-extinct mammals of the 
Mesozoic. 

The level of EDA required to give rise to 
individual molar characteristics therefore 
seems to provide information about how 
robust their development is. When studying 
how morphogenesis drives evolution, it will 
be crucial to bear in mind that the sensitivity 
of a particular tooth feature to EDA activity 
may indicate the likelihood of an evolution- 
ary transformation producing that feature. 
For example, as mentioned above, talonid-like 
features evolved twice — in the basal diversi- 
fication of modern mammals and in early- 
divergent groups of the Mesozoic. Variable 
sensitivities to gene-expression dosage and sig- 
nalling strength can serve as a measure of the 
evolutionary variability of each tooth feature, 
and may underpin the many convergences 
and reversals of tooth evolution observed in 
the mammalian fossil record. 

Eda and Shh have varying effects on many 
vertebrate structures, so it can be hard to tease 
apart which evolutionary feature is controlled 
by which part of the gene network. Harjunmaa 
et al. have cleared this hurdle in a welcome 
development that brings us closer to being able 
to test how changes in morphogenesis affect 
the final shape of evolving teeth as seen in the 
fossil record. Genetic engineering of develop- 
mental processes in vitro is a fruitful way to 
decipher how the shape of organs or other
biological structures is modified by evolution.

Convergence of terrestrial plant
production across global climate gradients


Sean T. Michaletz, Dongliang Cheng, Andrew J. Kerkhoff & Brian J. Enquist


Variation in terrestrial net primary production (NPP) with climate is thought to originate from a direct influence of tem- 
perature and precipitation on plant metabolism. However, variation in NPP may also result from an indirect influence of 
climate by means of plant age, stand biomass, growing season length and local adaptation. To identify the relative impor- 
tance of direct and indirect climate effects, we extend metabolic scaling theory to link hypothesized climate influences 
with NPP, and assess hypothesized relationships using a global compilation of ecosystem woody plant biomass and pro- 
duction data. Notably, age and biomass explained most of the variation in production whereas temperature and precipi- 
tation explained almost none, suggesting that climate indirectly (not directly) influences production. Furthermore, our 
theory shows that variation in NPP is characterized by a common scaling relationship, suggesting that global change models
can incorporate the mechanisms governing this relationship to improve predictions of future ecosystem function. 


Annual net primary production (NPP; g m⁻² yr⁻¹) of woody plants is
a major component of the terrestrial carbon cycle. Although it has long
been known that NPP correlates with mean annual temperature and
mean annual precipitation, efforts to understand the interactions between
climate and metabolism have resulted in two conflicting generalizations.
On the one hand, across a wide range of ecologically relevant temperatures,
rates of photosynthesis and respiration increase approximately exponentially
with temperature to a critical value beyond which rates decrease.
Furthermore, photosynthetic rate is limited by water availability, as reflected
by a positive relationship between NPP and precipitation. These
patterns suggest that climate influences NPP directly via metabolic kinetics.
On the other hand, across broad environmental gradients, local
adaptation of thermal and edaphic tolerances may dampen physiological
responses. Under this scenario, correlations between NPP and
annual climate variables result not directly from variation in metabolic
rates, but rather indirectly via variation in plant size and stand biomass,
stand age structure and growing season length.


Metabolic scaling theory for NPP 


To assess these differing viewpoints, we build upon metabolic scaling
theory of forest structure and dynamics to test the relative importance
of direct and indirect climate effects on NPP (Supplementary Information).
First, according to metabolic scaling theory, instantaneous
rates of respiration, photosynthesis and growth scale predictably with
plant size and temperature. Second, extensions of metabolic scaling theory
to whole-ecosystem functioning predict that NPP will scale with stand
biomass and size of the largest individual. Third, to incorporate
the effects of temperature, precipitation, growing season length and plant
age, we assume that their effects are inherently multiplicative. As a result,
hypothesized drivers of NPP can be assessed via the general equation


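$$\mathrm{NPP} = g_1\, P^{\alpha_P}\, l_{gs}^{\alpha_{l_{gs}}}\, a^{\alpha_a}\, e^{-E/kT}\, \frac{c_n}{A} \left[\frac{5 c_m^{8/3}}{3 c_n}\right]^{\alpha} M_{tot}^{\alpha} \qquad (1)$$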

Here, the dependencies of NPP on precipitation P (mm), growing season
length l_gs (months (mo) yr⁻¹) and plant age a (yr) are characterized as
power laws with exponents α_P, α_lgs and α_a, respectively (Extended Data
Fig. 1). This derivation permits evaluation of nonlinear relationships
as well as a hypothesized direct proportionality between NPP and l_gs
(that is, α_lgs = 1). The influence of temperature T (K) is characterized
by an Arrhenius relation with an activation energy E (eV) and Boltzmann's
constant k (8.617 × 10⁻⁵ eV K⁻¹). An activation energy of 0.32 eV has
been hypothesized for the kinetics of photosynthesis. The influences
of plant size and stand biomass are described by a size-corrected measure
of the stand size distribution c_n (where f(r) = dn/dr = c_n r⁻² and
r is stem radius; m), a normalization constant c_m relating stem radius
to plant mass (r = c_m m^(3/8); m g^(−3/8)), the total stand biomass M_tot (g),
the stand area A (m²), and a growth normalization constant g_1
(ent) yr ““mm~*» mo” “8 ), 

To best test metabolic scaling theory predictions and to evaluate direct
and indirect climate effects, we recast equation (1) to give a more
instantaneous monthly net primary production (NPP/l_gs; g m⁻² mo⁻¹) as

$$\frac{\mathrm{NPP}}{l_{gs}} = g_2\, P^{\alpha_P}\, a^{\alpha_a}\, e^{-E/kT}\, \frac{c_n}{A} \left[\frac{5 c_m^{8/3}}{3 c_n}\right]^{\alpha} M_{tot}^{\alpha} \qquad (2)$$

where l_gs (mo yr⁻¹) is growing season length and g_2 is another growth
normalization constant (g m^(−1−(5α/3)) mo^(−1−α_lgs) mm^(−α_P) yr^(−α_a)). As
discussed below, g_1 and g_2 are governed by several prominent functional
and physiological traits and may thus vary with stand characteristics
such as soil fertility, leaf type and biome. Equations (1) and (2) can
be linearized, respectively, as


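$$\ln(\mathrm{NPP}) = \beta_{01} + \alpha_P \ln(P) + \alpha_{l_{gs}} \ln(l_{gs}) + \alpha_a \ln(a) - \frac{E}{kT} + \alpha \ln(M_{tot}) \qquad (3)$$

and

$$\ln\left(\frac{\mathrm{NPP}}{l_{gs}}\right) = \beta_{02} + \alpha_P \ln(P) + \alpha_a \ln(a) - \frac{E}{kT} + \alpha \ln(M_{tot}) \qquad (4)$$

where the intercepts are $\beta_{01} = \ln(g_1) + \ln(c_n/A) + \ln\left(\left[5 c_m^{8/3}/3 c_n\right]^{\alpha}\right)$
and $\beta_{02} = \ln(g_2) + \ln(c_n/A) + \ln\left(\left[5 c_m^{8/3}/3 c_n\right]^{\alpha}\right)$.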

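The step from the multiplicative form (equation (1)) to the log-linear form (equation (3)) can be checked numerically. The sketch below is illustrative only: the function names and every parameter value are hypothetical placeholders chosen to exercise the algebra, not estimates from the paper, and it assumes that equation (1) has the multiplicative structure implied by equation (3) and the stated intercept.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann's constant in eV per kelvin (value quoted in the text)


def npp_power_law(P, l_gs, a, T, M_tot, p):
    """Equation (1): multiplicative form of annual NPP."""
    size_term = (5.0 * p["c_m"] ** (8.0 / 3.0) / (3.0 * p["c_n"])) ** p["alpha"]
    return (
        p["g1"]
        * P ** p["alpha_P"]
        * l_gs ** p["alpha_lgs"]
        * a ** p["alpha_a"]
        * math.exp(-p["E"] / (K_BOLTZMANN_EV * T))
        * (p["c_n"] / p["A"])
        * size_term
        * M_tot ** p["alpha"]
    )


def log_npp_linear(P, l_gs, a, T, M_tot, p):
    """Equation (3): the same model after taking natural logarithms."""
    beta_01 = (
        math.log(p["g1"])
        + math.log(p["c_n"] / p["A"])
        + p["alpha"] * math.log(5.0 * p["c_m"] ** (8.0 / 3.0) / (3.0 * p["c_n"]))
    )
    return (
        beta_01
        + p["alpha_P"] * math.log(P)
        + p["alpha_lgs"] * math.log(l_gs)
        + p["alpha_a"] * math.log(a)
        - p["E"] / (K_BOLTZMANN_EV * T)
        + p["alpha"] * math.log(M_tot)
    )


# Hypothetical parameter values and site conditions (assumptions, not fitted values).
params = {"g1": 2.0, "c_n": 0.5, "c_m": 0.01, "A": 1000.0,
          "alpha_P": 0.04, "alpha_lgs": 0.06, "alpha_a": -0.57,
          "alpha": 0.76, "E": 0.2}
site = dict(P=1200.0, l_gs=8.0, a=60.0, T=288.15, M_tot=2.5e7)

direct = math.log(npp_power_law(p=params, **site))
linear = log_npp_linear(p=params, **site)
```

The two values agree to floating-point precision, which is the property that lets the exponents be estimated as ordinary regression coefficients.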

Evaluating hypothesized drivers of NPP 


Here we use equations (3) and (4) to evaluate several long-standing 
hypotheses for direct and indirect climate effects on global variation 
in NPP and NPP/l_gs. Specifically, we conducted four separate analyses
of globally distributed data on woody plant production. Our data set 
spans broad ranges of temperature and precipitation (Fig. 1a). 

First, in agreement with previous reports, NPP is a significant correlate
of mean annual temperature and mean annual precipitation (Fig. 1b, c
and Extended Data Table 1). Furthermore, NPP and average annual temperature
<1/kT> followed an Arrhenius relationship (Fig. 2a and Extended
Data Table 1) with an estimated activation energy (E = 0.296 eV;
95% confidence interval (CI) = 0.268 to 0.324) that was not different
from the hypothesized 0.32 eV (ref. 15). Additionally, NPP was significantly
influenced by stand biomass and decreased with plant age (Fig. 1d
and Extended Data Table 2).

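The Arrhenius analysis described above amounts to regressing ln(NPP) on the Boltzmann factor exponent 1/kT and reading the activation energy off the slope. The following is a minimal sketch using synthetic, noise-free data; the function names and data are hypothetical, and the paper's own fitting procedure may differ in detail.

```python
import numpy as np

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann's constant in eV per kelvin


def boltzmann_factor_exponent(temp_celsius):
    """Return 1/kT (in eV^-1) for temperatures given in degrees Celsius."""
    temp_kelvin = np.asarray(temp_celsius, dtype=float) + 273.15
    return 1.0 / (K_BOLTZMANN_EV * temp_kelvin)


def estimate_activation_energy(temp_celsius, npp):
    """Fit ln(NPP) = c - E * (1/kT) by least squares and return E in eV."""
    x = boltzmann_factor_exponent(temp_celsius)
    slope, _intercept = np.polyfit(x, np.log(npp), 1)
    return -slope


# Synthetic example: data generated with a known activation energy of 0.32 eV.
temps = np.linspace(-5.0, 27.0, 30)
true_E = 0.32
npp = 1.0e8 * np.exp(-true_E * boltzmann_factor_exponent(temps))
E_hat = estimate_activation_energy(temps, npp)
```

With real data the slope also absorbs any covariate that changes along the temperature gradient, which is the confounding problem the multivariate analyses below address.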
Second, we examined NPP/l_gs (Fig. 2) to assess how climate and ecosystem
variables influenced more instantaneous production rates. In contrast
to results for NPP, average growing season temperature <1/kT>_gs,
mean annual precipitation, and mean growing season precipitation
explained little to no variation in global NPP/l_gs (Fig. 2b-d and Extended
Data Table 1). The relationship between NPP/l_gs and average growing
season temperature (Fig. 2b) provided an estimate of E = −0.067 eV
(95% CI = −0.131 to −0.003) that was significantly lower than and opposite
in direction to the hypothesized 0.32 eV (ref. 15). This suggests that
local adaptation may dampen the ambient temperature response between
communities and that the correlation between NPP and mean annual
temperature (Fig. 1b) is spurious. For example, mean annual temperature
was strongly correlated with growing season length (Extended Data
Fig. 2).

Third, to assess hypothesized drivers of NPP while simultaneously
controlling for the influence of all other model covariates, we fitted the
complete model (equations (1) and (3)) to the data using multiple regression
(Table 1 and Extended Data Fig. 3). A large proportion of variation
in NPP (full model adjusted R² = 0.769) was explained by just two variables:
stand biomass and plant age. Importantly, in contrast to pairwise
correlations (Fig. 1b, c), this multivariate approach revealed that average
growing season temperature (partial r² = 0.073) and mean growing
season precipitation (partial r² = 0.011) explained little of the variation
in NPP, and growing season length explained almost none (partial
r² = 0.004). The mass-scaling exponent was estimated as α = 0.763 (95%
CI = 0.735 to 0.792), significantly greater than the metabolic scaling
theory prediction of 3/5 = 0.60. Furthermore, the scaling exponent for
growing season length (α_lgs = 0.058; 95% CI = 0.007 to 0.109) was
significantly lower than the value of 1 required for direct proportionality.
The 95% CI of the estimated activation energy (E = 0.195 eV; 0.156 to 0.234)
did not include the hypothesized 0.32 eV. These general conclusions did
not change when using mean annual temperature and/or mean annual
precipitation, or root, aboveground woody, and foliage components of
NPP (Extended Data Table 3).

Comparing the fit of the complete model (equations (1) and (3)) with
a simpler model containing only plant age and stand biomass
(NPP = c a^α_a M_tot^α) yielded similar coefficients of determination but a higher Akaike
information criterion (AIC) value for the complete model (AIC = 140.115,
R² = 0.769) than for the simpler model (AIC = −1768.585, R² = 0.735;
Fig. 3a and Extended Data Fig. 4), indicating that age and biomass together
explained most of the variation in NPP (Fig. 3a). Soil and leaf trait differences
did not influence scaling relationships, but did influence normalization
constants (Fig. 3b, c and Supplementary Information). Biome
differences affected scaling relationships in some cases and normalization
constants in others (Extended Data Fig. 4 and Supplementary Information).

Figure 1 | Global variation in annual net primary production for 1,247
woody plant communities grouped by age class. a, Precipitation-temperature
space occupied by the plant communities. Biome definitions from
ref. 39. 1, tropical rainforest; 2, temperate rainforest; 3, tropical seasonal
forest; 4, temperate forest; 5, taiga; 6, savannah; 7, woodland/shrubland; 8,
tundra; 9, desert. b, Relationship between NPP and mean annual temperature.
c, Relationship between NPP and mean annual precipitation. d, Relationship
between NPP and stand biomass. Grey, 0-50 years; orange, 51-100 years;
blue, 101-200 years; black, ≥201 years.

Figure 2 | Net primary production of woody plant communities across
global climate gradients. a, Annual temperature. b, Growing season
temperature. c, Annual precipitation. d, Growing season precipitation. Dark
grey, annual net primary production (NPP); light grey, monthly net primary
production (NPP/l_gs); e, mathematical constant (~2.718).

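The comparison between the complete model and the age-plus-biomass model can be sketched as two ordinary least squares fits on log-transformed variables, scored with AIC. Everything below is an assumption made for illustration: the data are synthetic (generated so that only age and biomass matter), the helper names are hypothetical, and the AIC is written in its Gaussian least-squares form up to an additive constant, which may not be the exact form the authors used.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and the residual sum of squares."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, float(resid @ resid)


def aic_gaussian(rss, n, n_coef):
    """AIC (up to an additive constant) for a Gaussian linear model."""
    k = n_coef + 1  # regression coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k


# Synthetic stands: ln(NPP) depends on age and biomass only (hypothetical values).
rng = np.random.default_rng(0)
n = 500
ln_a = rng.normal(4.0, 0.8, n)
ln_M = rng.normal(10.0, 1.0, n)
ln_P = rng.normal(7.0, 0.5, n)
inv_kT = rng.normal(40.0, 1.5, n)
ln_lgs = rng.normal(2.0, 0.3, n)
ln_npp = 9.0 - 0.57 * ln_a + 0.76 * ln_M + rng.normal(0.0, 0.3, n)

X_full = np.column_stack([ln_P, ln_lgs, ln_a, inv_kT, ln_M])  # equation (3)
X_simple = np.column_stack([ln_a, ln_M])                      # age and biomass only

coef_full, rss_full = fit_ols(X_full, ln_npp)
coef_simple, rss_simple = fit_ols(X_simple, ln_npp)
aic_full = aic_gaussian(rss_full, n, X_full.shape[1] + 1)
aic_simple = aic_gaussian(rss_simple, n, X_simple.shape[1] + 1)
```

Because the models are nested, the complete model can never fit worse in terms of residual variance; AIC penalizes its three extra coefficients, so a lower AIC for the simpler model indicates that the climate terms add little.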
Fourth, because metabolic scaling theory predictions are based on
instantaneous rates, a more precise evaluation should consider instantaneous
NPP with climate variables most relevant to growth physiology.
Therefore, we used multiple regression to assess metabolic scaling theory
for rates of NPP/l_gs (equations (2) and (4)) using average growing
season temperature and mean growing season precipitation (Table 1
and Fig. 4). With these data, all covariates were significant (Table 1).
The explanatory ability of this model was reduced (adjusted R² = 0.440),
probably due to error in growing season length estimates. Similar to
NPP, stand biomass and plant age were the best predictors of variation
in NPP/l_gs, with little of the variation explained by mean growing season
precipitation (partial r² = 0.095) and essentially none of the variation
explained by average growing season temperature (partial r² = 0.007).
In support of metabolic scaling theory, the mass-scaling exponent was
estimated as α = 0.613 (95% CI = 0.575 to 0.652), which is indistinguishable
from the metabolic scaling theory prediction of 3/5 = 0.60.
However, the estimated activation energy E = −0.079 eV (95% CI = −0.130
to −0.028) was significantly lower than and opposite in direction to the
hypothesized value of E = 0.32 eV (ref. 15). These conclusions did not generally
change when using root, aboveground woody and foliage components
of NPP/l_gs (Extended Data Table 3).

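Partial regression, as used for the partial r² values and for Fig. 4, isolates one covariate by removing from both the response and that covariate everything explained by the other covariates. The sketch below shows the standard construction on synthetic data; the function names are hypothetical, and defining partial r² as the squared correlation of the two residual series is an assumption about how the reported values were obtained.

```python
import numpy as np


def residualize(target, others):
    """Residuals of `target` after least-squares regression on `others` plus an intercept."""
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def partial_regression(y, X, j):
    """Return the partial slope and partial r^2 of covariate j, controlling for the rest."""
    others = np.delete(X, j, axis=1)
    ry = residualize(y, others)        # response with other covariates removed
    rx = residualize(X[:, j], others)  # covariate j with other covariates removed
    slope = float(rx @ ry / (rx @ rx))
    partial_r2 = float((rx @ ry) ** 2 / ((rx @ rx) * (ry @ ry)))
    return slope, partial_r2


# Synthetic example with correlated covariates (hypothetical values).
rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=(n, 3))
X[:, 1] += 0.6 * X[:, 0]
y = 1.0 + 0.6 * X[:, 0] - 0.35 * X[:, 1] + rng.normal(0.0, 0.5, n)

design = np.column_stack([np.ones(n), X])
full_coef, *_ = np.linalg.lstsq(design, y, rcond=None)
slope_0, r2_0 = partial_regression(y, X, 0)
```

The partial slope equals the coefficient from the full multiple regression, which is why a partial regression plot displays the relationship between the response and one covariate with all other covariates held constant.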

Climate has little direct effect on NPP 


Our analyses question several long-held and more recent conclusions
regarding the influence of climate on global variation in NPP, and reveal
that a number of central conclusions established in studies using bivariate
regression are probably spurious. For example, mean annual temperature
and mean annual precipitation are often cited as primary drivers of
NPP (see refs 2-5 and Fig. 1b, c), but after controlling for plant age and
stand biomass, temperature and precipitation explained little to none
of the variation (Table 1 and Fig. 4a, b). Likewise, hypothesized activation
energies of 0.32 eV have been supported by bivariate relationships (for
example, Fig. 2a and ref. 15), but accounting for other covariates yielded
estimates that did not support these predictions (Table 1 and Fig. 4a).
These results are intriguing given that the temperature-dependencies of
local scale photosynthesis and respiration are well established.

Figure 3 | Global variation in annual net primary production of woody
plant communities expressed as a general scaling function of plant age a and
stand biomass M_tot. a, 1,247 stands grouped by age class with ordinary least
squares (OLS) regression line; b, 1,247 stands grouped by leaf functional trait
type with standardized major axis (SMA) regression lines; c, 1,237 stands
grouped by soil fertility class with SMA regression lines. Grey, 0-50 years; light
orange, 51-100 years; light blue, 101-200 years; black, ≥201 years; dark orange,
broadleaf; light green, needle-leaf; dark blue, mixed-leaf; pink, low soil fertility;
yellow, medium soil fertility; dark green, high soil fertility.

Three factors might account for the absence of direct climate effects.
First, like most studies in plant ecology and ecosystem metabolism,
our analyses considered ambient air temperature. However, air temperatures
may not reflect the tissue temperatures that govern plant growth
rates, because plant traits (thermophysical properties) can influence energy
budgets and decouple plant tissue temperatures from air temperature.
This could dampen ambient temperature correlations across climate
gradients if, for example, selection adjusts leaf traits to maintain leaf
temperatures near photosynthetic optima. As air temperature is one
of the most commonly used climate variables in plant and ecosystem
ecology, any directional plant-air temperature differences (as recently
suggested) will have profound implications for our understanding of
plant-climate interactions. Mean annual air temperature may yield
especially misleading results, because it can differ substantially from the
operative temperatures of organisms, and it also covaries with other key
drivers of metabolism and NPP (see, for example, Extended Data Fig. 2)
that can act as confounding variables to produce spurious relationships.
We suggest that future studies move away from using mean annual air
temperature and instead use air and plant body temperatures measured
during the growing season and/or key periods of development. Second,
biochemical adaptation and/or acclimatization to cold temperatures may
increase plant metabolism. For example, observed shifts in foliar chemistry
and metabolic efficiencies have been argued to offset variation in
metabolic kinetics across temperature gradients. Third, the climate
data used here were interpolated from 29-year climate station means
and are not necessarily representative of the years when production
data were obtained. Thus, regressing short-term NPP estimates on longer-term
climate estimates will add noise to the relationships. Although
we are unaware of whether this error yields a directional bias across climate
gradients, future work should assess its importance.


7 AUGUST 2014 | VOL 512 | NATURE | 41

©2014 Macmillan Publishers Limited. All rights reserved


Table 1 | Multiple regression fits of theory (equations (3) and (4)) to a global compilation of data from 1,247 woody plant communities

Dependent variable      Covariate    Coefficient  Estimate  95% CI             s.e.   t        P value       Partial r²
NPP (g m⁻² yr⁻¹)        (intercept)  β_0          9.336     7.758 to 10.914    0.804  11.609   <2 × 10⁻¹⁶    0.098
                        <1/kT>_gs    E            0.195     0.156 to 0.234     0.020  9.854    <2 × 10⁻¹⁶    0.073
                        a            α_a          −0.568    −0.599 to −0.537   0.016  −35.808  <2 × 10⁻¹⁶    0.508
                        l_gs         α_lgs        0.058     0.007 to 0.109     0.026  2.223    0.026         0.004
                        M_tot        α            0.763     0.735 to 0.792     0.014  52.863   <2 × 10⁻¹⁶    0.693
                        P_gs         α_P          0.043     0.020 to 0.067     0.012  3.664    2.58 × 10⁻⁴   0.011
NPP/l_gs (g m⁻² mo⁻¹)   (intercept)  β_0          −1.652    −3.741 to 0.438    1.065  −1.551   0.121         0.002
                        <1/kT>_gs    E            −0.079    −0.130 to −0.028   0.026  −3.016   0.003         0.007
                        a            α_a          −0.351    −0.392 to −0.310   0.021  −16.711  <2 × 10⁻¹⁶    0.184
                        M_tot        α            0.613     0.575 to 0.652     0.020  30.995   <2 × 10⁻¹⁶    0.436
                        P_gs         α_P          −0.168    −0.197 to −0.139   0.015  −11.444  <2 × 10⁻¹⁶    0.095

<1/kT>_gs, average growing season temperature; a, age; E, activation energy; l_gs, growing season length; M_tot, stand biomass;
P_gs, mean growing season precipitation; α, mass scaling exponent; α_a, age scaling exponent; α_lgs, growing season length scaling
exponent; α_P, precipitation scaling exponent; β_0, intercept.

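The 95% confidence intervals in Table 1 are consistent with the usual large-sample construction, estimate ± 1.96 × s.e. The short check below uses four rows of Table 1; the normal-approximation multiplier of 1.96 is an assumption (the authors may have used t quantiles, which are nearly identical for about 1,247 observations), and the function name is hypothetical.

```python
def wald_interval(estimate, std_error, z=1.96):
    """Large-sample 95% confidence interval: estimate plus or minus z standard errors."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# (estimate, s.e., reported 95% CI) taken from Table 1.
rows = {
    "E (NPP)": (0.195, 0.020, (0.156, 0.234)),
    "alpha (NPP)": (0.763, 0.014, (0.735, 0.792)),
    "alpha_lgs (NPP)": (0.058, 0.026, (0.007, 0.109)),
    "alpha (NPP/l_gs)": (0.613, 0.020, (0.575, 0.652)),
}

# Largest absolute difference between a recomputed and a reported interval bound.
max_gap = max(
    max(abs(lo - rep_lo), abs(hi - rep_hi))
    for est, se, (rep_lo, rep_hi) in rows.values()
    for lo, hi in [wald_interval(est, se)]
)
```

The recomputed bounds match the reported ones to within rounding of the printed estimates and standard errors.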
Terrestrial NPP increases asymptotically with precipitation because
plant growth in terrestrial ecosystems is generally water limited. However,
our analyses suggest that this relationship does not occur through direct
effects of precipitation on plant metabolism per se, but instead via indirect
effects of water availability on stand biomass and plant age. Although
this is counterintuitive given the importance of water in whole-plant
physiology (for example, avoidance of xylem embolism and control of
stomatal aperture), it is consistent with a more hydrological view of
plant-atmosphere interactions. Specifically, biomass controls the total
leaf area that drives transpiration, assimilation and growth. Furthermore,
precipitation is not necessarily representative of plant-available
water, so rather than using precipitation as the primary measure of
plant-available water, evapotranspiration should also be included.


Plant age effects on NPP 


Even after controlling for stand biomass and climate, global NPP/l_gs
decreased with age so that (for a given biomass) younger stands had
higher rates of production. Although such age-related declines are well
documented, the drivers of this pattern remain unclear. Numerous
mechanisms have been proposed, including: (1) hydraulic limitation of
plant height; (2) changes in carbon use efficiency, potentially from
increasing respiration requirements with size; (3) ontogenetic shifts in
biomass allocation, with smaller plants having proportionately more
foliage and higher assimilation rates; (4) increasing light limitation
with age; and (5) decreasing stand density with age, which would
act through the size-corrected size distribution term c_n. Together, our
results reiterate a need in global change studies for a deeper understanding
of the causal mechanisms linking age to plant production.

Figure 4 | Partial regression plots illustrating relationships between
monthly net primary production (NPP/l_gs) and individual covariates from
equation (4) for 1,247 woody plant communities. Plots show the correct
relationship (slope and variance) between NPP/l_gs and each covariate while
controlling for the influence of all other model covariates. a, Average growing
season temperature; b, mean growing season precipitation; c, stand biomass;
d, plant age; e, mathematical constant (~2.718).


Climate controls NPP via biomass and age 


Whereas temperature and water availability are fundamental drivers
of plant physiology and ecosystem metabolism at local scales, at
global scales they appear to have little direct kinetic control on NPP.
Instead, our findings suggest that climate influences NPP indirectly
via plant age and stand biomass (see ref. 11), which is largely driven by
maximum plant size. For example, plant age is influenced by time
since last disturbance, and maximum plant size is constrained by limitations
on the water and energy fluxes necessary to support basal
metabolism.

Our theoretical framework further extends the predictive ability of
metabolic scaling theory and links multiple hypothesized climate drivers
to plant and ecosystem metabolism. Furthermore, it uniquely
(1) integrates metabolic scaling and physiological approaches to plant
ecology and global change studies; (2) underscores the importance of
plant size, allometry and age as primary drivers of variation in ecosystem
metabolism; and (3) provides a foundation to assess if adaptive differences
in plant form and function can compensate for broad-scale
climate gradients. Although our results support the mechanistic basis
of metabolic scaling theory for the origin of the ecosystem mass-scaling
exponent α (see ref. 11) and the influence of plant traits on variation in
the normalization constants g_1 and g_2, they also highlight a need for
integrative theory that explains the age-dependence of terrestrial NPP
(estimated here as α_a = −0.65). Additionally, future work is needed on
the geographical scale at which the activation energy for plant metabolism
becomes decoupled from temperature.

We have shown that global variation in terrestrial NPP is consistent
with the hypothesis that the diversity of plant form and function originated
via selection to maximize plant growth across climate gradients
(Fig. 4). This has resulted in convergence to a common scaling relationship
between NPP, plant age and total stand biomass (Fig. 3). Interestingly,
recent analyses indicate that both mean annual temperature and mean
annual precipitation are also poor predictors of total stand biomass
(but see ref. 33). Additionally, metabolic scaling theory predicts and
recent empirical data show that the best predictor of total autotrophic
ecosystem biomass appears to be the size of the largest individual.
Consequently, efforts to predict ecosystem function in response
to global change should include the mechanisms that govern maximum
plant size and general ecosystem scaling relationships (Fig. 3).
METHODS SUMMARY


We assessed variation in NPP and NPP/l_gs across broad climate gradients using a global
compilation of biomass and production data for 1,247 woody plant communities
(and Malhi, Y. et al., submitted) and climate data from a high-resolution gridded data
set (see Methods). NPP was computed as the sum of annual production of root,
stem, branch, reproductive (when available) and foliage components. To calculate
NPP/l_gs, growing season length was calculated as the number of months with a mean
minimum temperature greater than 0.6 °C and a moisture index MI > 0.048 (ref. 8).
Temperature and precipitation were calculated as both annual and growing season
averages, and temperature was also expressed as the Boltzmann factor exponent
1/kT. Relationships between production and climate variables were first assessed
using OLS linear regression. Next, equations (3) and (4) were fit to compiled data to
evaluate relationships between NPP or NPP/l_gs and hypothesized drivers. Finally,
the fit of the complete model (equations (1) and (3)) was compared with that of a
simpler model (NPP = c a^α_a M_tot^α) obtained via multiple regression with age and
biomass as covariates.

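The growing season rule stated in the Methods Summary (months with mean minimum temperature above 0.6 °C and moisture index above 0.048) can be sketched as a simple count over twelve monthly values. The function names and the example site data below are hypothetical and serve only to illustrate the rule; how the moisture index itself is computed is described in the Methods and is not reproduced here.

```python
def growing_season_length(monthly_min_temp_c, monthly_moisture_index,
                          temp_threshold=0.6, mi_threshold=0.048):
    """Count months whose mean minimum temperature and moisture index both exceed the thresholds."""
    if len(monthly_min_temp_c) != 12 or len(monthly_moisture_index) != 12:
        raise ValueError("expected 12 monthly values for each variable")
    return sum(
        1
        for temp, mi in zip(monthly_min_temp_c, monthly_moisture_index)
        if temp > temp_threshold and mi > mi_threshold
    )


def monthly_npp(annual_npp, l_gs):
    """Convert annual NPP (g m^-2 yr^-1) to NPP/l_gs (g m^-2 mo^-1)."""
    if l_gs <= 0:
        raise ValueError("growing season length must be positive")
    return annual_npp / l_gs


# Hypothetical site: cold winters and a dry mid-summer.
tmin = [-8.0, -6.5, -2.0, 1.5, 6.0, 10.5, 13.0, 12.0, 8.0, 3.0, -1.0, -6.0]
mi = [0.9, 0.8, 0.7, 0.5, 0.3, 0.1, 0.04, 0.03, 0.2, 0.5, 0.8, 0.9]

l_gs = growing_season_length(tmin, mi)   # five months pass both thresholds
rate = monthly_npp(1000.0, l_gs)
```

In this example seven months are warm enough, but two of them are too dry, so the growing season is five months long.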

Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
The evolutionary relationships of extinct species are ascertained primarily through the analysis of morphological char-
acters. Character inter-dependencies can have a substantial effect on evolutionary interpretations, but the develop- 
mental underpinnings of character inter-dependence remain obscure because experiments frequently do not provide 
detailed resolution of morphological characters. Here we show experimentally and computationally how gradual modi- 
fication of development differentially affects characters in the mouse dentition. We found that intermediate phenotypes 
could be produced by gradually adding ectodysplasin A (EDA) protein in culture to tooth explants carrying a null mutation 
in the tooth-patterning gene Eda. By identifying development-based character inter-dependencies, we show how to 
predict morphological patterns of teeth among mammalian species. Finally, in vivo inhibition of sonic hedgehog signalling 
in Eda null teeth enabled us to reproduce characters deep in the rodent ancestry. Taken together, evolutionarily
informative transitions can be experimentally reproduced, thereby providing development-based expectations for
character-state transitions used in evolutionary studies.


In the case of extinct mammals, a large number of dental features are
used as characters in phylogenetic analyses, and these characters often
provide the key evidence for evolutionary inferences due to the preponderance
of teeth in the fossil record. For reliable phylogenetic inferences,
characters have been typically considered to be independent from
each other. Although developmental factors can make characters
dependent, thorough analyses of the influence of development on
character state changes are lacking. To approximate changes relevant
to evolutionary transitions, experiments that tune morphology gradually
are required. These kinds of experiments are also useful to evaluate
how, and whether, continuous changes in underlying developmental or
genetic parameters map to continuous changes in the phenotype.

Here we investigated whether gradual alterations of tooth development
can produce gradual changes in the phenotype, and whether these
changes reflect known evolutionary transitions. We focused on the development
of the rodent dentition, using mice carrying a spontaneously
occurring null mutation in ectodysplasin (Eda) as a starting point. This
mutation was chosen because its effects on tooth morphology
are relatively subtle, causing simplification of dental morphology without
complete loss of teeth, yet it changes many characters and is
thus highly informative.


Experimental tuning of morphology 

We reasoned that, to approximate evolutionary transitions, fine-tuning
of EDA signalling would be required. We tracked gradual changes during
development by crossing Eda null mice with mice that express green
fluorescent protein (GFP) from the Shh locus (hereafter called ShhGFP
mice). The epifluorescence of ShhGFP mice can be used to monitor
tooth cusp development because Shh is initially expressed in the enamel
knots, which are the epithelial signalling centres that form at the positions
of future cusps. Later during differentiation, Shh expression spreads
throughout the inner enamel epithelium, enabling the visualization of
the overall crown shape.

First, we used EDA protein in culture at increasing concentrations 
(n = 9 to 16 in each group, Supplementary Table 1, Methods) to test 
whether the Eda null morphology could be engineered to gradually resem- 
ble wild-type morphology. We cultured first lower molars starting at 
embryonic day 13, just before crown formation begins, and EDA pro- 
tein was administered into the culture media at days zero and two. This 
treatment scheme restored EDA signalling during the period of first 
molar cusp patterning. At this stage, Eda is thought to regulate the size 
and signalling of enamel knots, which in turn give rise to tooth cusps.

The EDA protein treatments restored the wild-type mouse cusp pattern
in culture (Fig. 1a), in agreement with previous experiments. We
next examined the mode of cusp appearance in detail by analysing daily
time-lapse images of the cultured teeth. The results showed that increasing
dosage of EDA caused a heterochronic shift in cusp initiation (Fig. 1a).
Specifically, some of the cusps were initiated earlier (predisplaced) as
EDA concentration was increased (Fig. 1a, Supplementary Table 1). Furthermore,
the time-lapse data showed that increasing EDA concentration
enlarged the primary enamel knot, which in turn increased the number
of cusps (Fig. 1b, Extended Data Fig. 1). The link between the primary
enamel knot size and cusp number (Fig. 1b) indicates that the overall
size of the tooth crown needs to reach certain thresholds to accommodate
additional cusps. From a developmental signalling point of view, a
heterometric change in the dosage of EDA signalling can lead to a
heterochronic shift in timing of cusp initiation.


Computational modelling of patterning 

Computational modelling has been used to simulate tooth shape development
and evolution, and the new developmental data that we obtained
allowed us to link models and experiments in unprecedented detail (see
Methods). To model the experimental transitions, we implemented a
morphodynamic model, which integrates signalling and tissue growth
to simulate tooth development, in the new ToothMaker interface (Extended
Data Fig. 2). First we modelled a wild-type mouse tooth morphology
corresponding to our cultured teeth (see Methods). Then, by
progressively adjusting (mutating) each parameter separately, we simulated
the effects of gradual changes in signalling (Extended Data Fig. 3).
The in silico simulation reproduced the fusion of cusps both on the
talonid and on the trigonid of the Eda null model, as well as rescue of
the separate cusps observed in the in vitro experiment (Fig. 2a, Extended
Data Fig. 4). Moreover, as predicted from the experimental observations,
the full range of transitions from fused trigonid to separate cusps
was replicated by varying the activator parameter that induces the formation
of enamel knots (Fig. 1a, Extended Data Fig. 4).

Figure 1 | Gradual dosage effects of EDA on Eda null mutant first lower
molars (m1). a, ShhGFP × Eda null tooth development is rescued by EDA,
with higher concentrations reproducing wild type (WT) development (Eda
null: n = 15; 10 ng ml⁻¹: n = 15; 50 ng ml⁻¹: n = 13; WT: n = 16; all teeth listed
in Supplementary Table 1: n = 113). Initiation of different parts of the tooth
crown is shifted earlier with higher EDA concentrations, such as the
anteroconid (black arrowheads) and the hypoconulid (white arrowheads).
b, Primary enamel knot size at culture day 2 predicts the number of cusps at
day 7. Eda null teeth treated with 10 ng ml⁻¹ (open circles) and 50 ng ml⁻¹
(open diamonds) fill in the phenotypic gap between the Eda null (black circles)
and wild-type (black diamonds) teeth. Anterior is towards the left in a. Scale
bar, 500 µm.

The comparable patterns in the experiments (Fig. 1) and the model- 
ling (Fig. 2) underscore the dynamic nature of tooth shape development, 
Figure 2 | Computational modelling of gradual changes in signalling on
cusp patterns. a, Computer simulations using ToothMaker (Extended Data
Fig. 2) of first lower molar development show appearance of cusp areas
(red colour). Larger values of activator (Act) parameter increase cusps.
b, Modelled wild-type mouse pattern is largely retained until Act = 0.5. Smaller
Act values quickly reduce cusps resulting in modelled teeth that resemble Eda
null teeth. c, Tabulated data from culture experiments (Supplementary Table 1)
show an abrupt change in cusps at low levels of EDA protein (2.5 to
10 ng ml⁻¹), similar to modelling data (b). Increasing EDA concentration
does not increase cusps beyond wild type (WT). Error bars denote s.d.
Anterior is towards the left in a.


and indicate that the identification of cusp homologies should rely on
topological correspondences rather than unique, cusp-specific gene expression
patterns. Furthermore, both the model simulations and the experiments
showed a rapid phenotypic response at low levels of activator
(Fig. 2b) and EDA signalling (Fig. 2c), respectively, with smaller changes
observed at higher levels of signalling. These results imply a potential
disjunction between rates of evolution measured in the phenotype and
in gene expression level, although by varying more than one parameter
at a time, a multivariate linear relationship between expression levels
and phenotypes remains possible.


Detailed analyses of character states 


To examine the full range of morphologies produced in the cultured
explants, we tabulated the character states comparable to the ones used
in evolutionary studies for each crown region (see Methods and Supplementary
Table 1). Although the tooth culturing system does not produce
mineralized features, we were able to tabulate the presence or
absence of cusps (Fig. 3a) and relative height of the talonid (Fig. 3b)
with high resolution.

First, the trigonid cusps, the protoconid and the metaconid, are frequently
fused in Eda null teeth (40% of explants, Fig. 3c, Supplementary
Table 1). The separation of the protoconid and the metaconid in
2.5 ng ml⁻¹ EDA-treated teeth reflects a transition that would predate
the evolution of the pretribosphenic mammalian pattern.

Next, the talonid, which in Eda null teeth is a shallow shelf lacking
well-defined cusps, was already affected in the lowest, 2.5 ng ml⁻¹ EDA
treatments by an increase in height, and in the second lowest, 10 ng ml⁻¹
treatments by acquisition of additional cusps (Fig. 3c, Supplementary
Table 1). These treatments, however, caused polymorphic effects (Fig. 3c).
In 47% of 10 ng ml⁻¹ explants, the shallow shelf gave rise to a single cusp,
whereas in the remaining explants two distinct cusps formed. These two
talonid cusps correspond to the hypoconid and the entoconid cusps in
wild-type teeth, and both cusps formed consistently starting at 100 ng ml⁻¹
EDA (Fig. 3c).

During the evolution of tribospheny in Mesozoic mammals, the
functional, three-cusped talonid was added to the posterior end of the
trigonid. This originated in Triassic non-mammalian synapsids where
a single cusp, often located on the cingulid, was appended posteriorly
to the basal three-cusped trigonid morphology. Although the hypoconid
is generally agreed to have evolved before the entoconid, our data
do not allow determination of whether the single cusp in the talonid of
Figure 3 | Differential sensitivities of tooth crown regions to EDA. a, Range
of tooth morphologies and cuspal character states tabulated at the end of the 
cultures (day 13 + 7). Character state numbers above first lower molar images 
correspond to the number of cusps present in the respective region of the crown 
(Supplementary Table 1). The first trait is for the anteroconid, the second for 
the trigonid, the third for the talonid and the fourth for the hypoconulid. 


cultured teeth is the hypoconid or the entoconid. The central location 
of the single talonid cusp in our cultured teeth (Figs 1a and 3a) does, 
however, suggest that the effects of EDA, at least partially, mimic the 
early steps of talonid evolution. 

Finally, the anteroconid and the hypoconulid appear already at 10-50 ng ml⁻¹
EDA, but unlike in vivo, their wild-type character states remain
polymorphic in cultured molars irrespective of treatments (shown by
hatched colouring in Fig. 3c). Evolutionarily, these crown features appear
in early rodents and are present in the basal murines. The hypoconulid,
however, has been lost in many murine lineages, and this evolutionary
lability appears to be reflected in the developmental data.

Taken together, these data show that even though dental characters
used in evolutionary analyses may be highly pleiotropic, as shown
previously, transformations of character states can occur at different
thresholds of signalling (Fig. 3c). In contrast to the relatively robust trigonid
cusps, large variation in talonid structure can be reproduced by
small changes of EDA signalling. This has major implications for the
evolution of tribosphenic mammals that are diagnosed by their derived
talonid features, as discussed below.


Testing the experimental predictions 


To test the development-based predictions of evolutionary patterns in
the talonid, we examined the link between talonid height and cusp number
across mammalian species. We first tabulated relative talonid height
(as percentage of the trigonid height) and cusp numbers in the entire
posterior region of the tooth (the entoconid, the hypoconid and the
hypoconulid; see Methods). The results show that talonid height and
cusp number are linked developmentally (Fig. 4a, Extended Data Table 1),
even though these traits are typically treated as independent characters
in evolutionary analyses. In the macroevolutionary context of rodents,
the patterns obtained in the experiments appear to bridge the derived
state we analysed in 35 species of extant murine rodents, already present
in the extinct Miocene early murines Potwarmus and Antemus,
and the basal morphology found in Tribosphenomys minutus, a Paleocene
mammal that is considered to be a basal rodentiaform, or the immediate
sister taxon of Glires (Fig. 4a, Extended Data Table 1).

Because our experimental morphologies extend even beyond those
found in rodents and towards further reduced talonids (Fig. 4a), we
also measured talonid height and cusp number in 32 species of extant
carnivorans. The first lower molar, or carnassial, of carnivorans shows arguably
the fullest range of talonid morphologies among extant mammals.
The correlated change in carnivoran talonid height and cusp number
(Fig. 4b, Extended Data Tables 2 and 3) is reminiscent of the patterns
found in experiments on the mouse (Fig. 4a). Furthermore, this relationship
between talonid height and morphology was retained when
we replaced cusp number with talonid complexity using orientation patch
count (OPC, Fig. 4c). In contrast, the trigonid morphology remained
relatively constrained (Extended Data Table 2). OPC, which is calculated
as the number of discrete surfaces distinguished by differences in
b, Talonid height characters are tabulated as the height of the talonid (white
arrowhead) relative to the trigonid (black arrowhead). c, Differential 
sensitivities of main regions of tooth crown to EDA show how different parts of 
the crown have varying trait sensitivities to EDA. For the full range of data, see 
Supplementary Table 1. 


orientation, has been shown to increase across the dietary spectrum
from carnivores to omnivores to herbivores in extant mammals.
Therefore, even though carnivoran dental diversity is driven by ecological
and functional factors, development may have an influence on
which parts undergo adaptations more easily.
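The OPC measure described above, the number of discrete surface patches distinguished by orientation, can be illustrated with a minimal sketch. This is not the software used in the study: the grid representation, the eight orientation bins, the use of the downslope direction, the 8-neighbour connectivity and the minimum patch size of three cells are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math
from collections import deque


def orientation_bins(height, n_bins=8):
    """Assign each interior grid cell to one of n_bins compass sectors
    according to its downslope direction (assumed convention)."""
    rows, cols = len(height), len(height[0])
    bins = [[None] * cols for _ in range(rows)]
    sector = 2 * math.pi / n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central-difference estimate of the surface gradient.
            dzdx = (height[r][c + 1] - height[r][c - 1]) / 2.0
            dzdy = (height[r + 1][c] - height[r - 1][c]) / 2.0
            if dzdx == 0 and dzdy == 0:
                continue  # flat cell: no orientation assigned
            angle = math.atan2(-dzdy, -dzdx) % (2 * math.pi)
            bins[r][c] = int(angle / sector) % n_bins
    return bins


def count_patches(bins, min_size=3):
    """Count connected groups of cells sharing an orientation bin."""
    rows, cols = len(bins), len(bins[0])
    seen = [[False] * cols for _ in range(rows)]
    neighbours = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                  if (dr, dc) != (0, 0)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if bins[r][c] is None or seen[r][c]:
                continue
            label, size = bins[r][c], 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one patch
                cr, cc = queue.popleft()
                size += 1
                for dr, dc in neighbours:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and bins[nr][nc] == label):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if size >= min_size:
                patches += 1
    return patches


def opc(height, n_bins=8, min_size=3):
    """Orientation patch count of a height grid (illustrative only)."""
    return count_patches(orientation_bins(height, n_bins), min_size)


# Toy surfaces: a tilted plane has one orientation, a ridged roof has two.
plane = [[c for c in range(5)] for _ in range(5)]
roof = [[-abs(c - 3) for c in range(7)] for _ in range(5)]
print(opc(plane), opc(roof))
```

A single tilted plane yields one patch and a two-sided roof yields two, which matches the intuition that more complex crowns, with more differently oriented facets, score higher.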

Taken together, the same developmental cascade, starting from the
trigonid, may have contributed both to the initial evolution of the talonid
at the base of mammalian evolution and to dental morphological
diversification during mammalian radiations. To a lesser degree, the
same pattern may hold for the anterior end of the teeth, and it is conceivable
that an analogous developmental cascade to the one that produced
the talonid also produced the anterior expansion of the crown
in pseudo-tribosphenic mammals. A cascading system of activation-inhibition
between teeth has been proposed to regulate molar proportions
in mammals, and in general, much of the evolutionary history of
mammalian dentitions may consist of tinkering with this pre-existing
developmental program.


Retrieving ancestral character states 

Despite the overall agreement between experimental and evolutionary 
patterns, there were small but important differences when compared 
with the details of rodent evolution. Most notably, the Eda null teeth had 
fused cusps, which differs from what is found in basal rodentiaforms 
Figure 4 | Testing developmental predictions on evolutionary patterns.
a, Relative height of the m1 talonid (measured as percentage of the trigonid
height) in each treatment (open circles, error bars denote s.e.m.) correlates with
the number of cusps in the talonid. The experimental data bridge the
corresponding values for murine rodent species (n = 35, black circle with s.e.m.)
and for a basal rodentiaform Tribosphenomys minutus (black diamond). The
reduced major axis regression slope for the experimental data including
untreated Eda null and wild-type teeth is 3.30 and the intercept is −1.79
(r² = 0.891) and the corresponding values for only explants with EDA are 4.22
and −1.63 (r² = 0.946). We note that these slopes can be considered
underestimates due to the variably present hypoconulid cusps in cultured
wild-type mouse teeth. b, c, The first lower molars of carnivoran species
(n = 32) show correlated changes in the talonid height and cusp number
(b) and the talonid height and talonid complexity (c) measured using OPC. The
reduced major axis regression slope for the graph in b is 6.53 and the intercept
is −1.38 (r² = 0.594), and for the graph in c it is 48.24 and the intercept is
−10.93 (r² = 0.562). For data and details see Methods and Extended Data
Tables 1-3.
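The reduced major axis (RMA) fits reported in this caption differ from ordinary least squares in that the slope is the ratio of the standard deviations of the two variables, signed by their correlation, so neither variable is treated as error-free. The following is a minimal sketch of that calculation; the function name and the example numbers are hypothetical and are not the measurements from the Extended Data tables.

```python
import math


def rma_regression(x, y):
    """Reduced major axis fit.

    Returns (slope, intercept, r_squared), where
    slope = sign(r) * sd(y) / sd(x) and the line passes through the means.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    r = sxy / math.sqrt(sxx * syy)
    sign = 1.0 if r >= 0 else -1.0
    slope = sign * math.sqrt(syy / sxx)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r * r


# Made-up illustration: relative talonid height (fraction of trigonid
# height) against number of talonid cusps.
heights = [0.45, 0.55, 0.70, 0.85, 0.95]
cusps = [0.0, 0.5, 1.0, 2.0, 2.5]
slope, intercept, r2 = rma_regression(heights, cusps)
print(f"slope={slope:.2f} intercept={intercept:.2f} r2={r2:.3f}")
```

Because the RMA slope equals the least-squares slope divided by |r|, it is always at least as steep as the ordinary regression slope for the same data.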
and close relatives. This difference in morphology indicated that, in
addition to EDA signalling, other pathways needed to be adjusted in
order to produce the evolutionarily basal morphologies. To address
this issue, we considered reducing the required cusp spacing, which
can be experimentally adjusted beyond the normal mouse pattern by
modulating multiple signalling pathways. We therefore next set out
to engineer mouse teeth that would have additional characters of basal
rodents (see Methods).

First, we modelled the reduction in the cusp spacing by decreasing
inhibition in the simulated Eda null teeth, which resulted in formation
of multiple cusps (Extended Data Fig. 5). Next, because SHH has been
shown to inhibit cusp formation by regulating cusp spacing, we
cultured Eda null samples with a SHH inhibitor, thereby inhibiting the
inhibitor. This treatment also circumvents the tendency of EDA to cause
the formation of crests or lophs between cusps, which are found
in evolutionarily derived rodents. The experimental results validated
the in silico model simulations: inhibition of SHH signalling in cultured
Eda null teeth caused the development of more distinct cusps, without
eliminating evolutionarily basal features of Eda null teeth such as the
height difference between trigonid and talonid (Fig. 5a, Extended Data
Fig. 6).

To push the experimental system further and to retrieve features present
in the basal taxon Tribosphenomys, we inhibited SHH in developing
Eda null teeth in vivo (see Methods). Tribosphenomys cusps are columnar
and well separated, lacking a crest called the metalophid (the trigonid
wall) that connects cusps together. The loss of the metalophid has been
linked to the basal rodentiaforms, but it has reappeared in many rodent
clades, including murines. Our in vivo engineered tooth shapes
showed columnar and laterally separated cusps without the metalophid
Figure 5 | Engineering mouse teeth to have basal character states. a, Eda null
teeth cultured with SHH antagonist (n = 9 of 11 teeth) show more and
better separated cusps (arrowheads, compare to Fig. 1a). b, Second molars
produced using in vivo treatment of Eda null mice with SHH antagonist
particularly show better separation of cusps compared to the Eda null teeth
(n = 2 of 4 mice). Tribosphenomys minutus teeth lack crests connecting cusps
(specimen V10776 shows p4-m3). c, Obliquely posterior views of molars show
the lack of the metalophid crest (arrowheads) in treated Eda null, which has
replicated the ancestral morphology of T. minutus. The Tribosphenomys molars
shown are the first (V10776, on the left) and the second molar (V10775
holotype, on the right). Teeth have been mirrored if needed to represent the left
side, anterior is towards the left in a and b and top in c. Scale bars, 500 µm.
(Fig. 5b), a morphology visible also at the enamel-dentin junction (Extended
Data Fig. 7). Although the effects of SHH inhibition were variable,
the trough separating the protoconid and metaconid approximated
the pattern found in Tribosphenomys (Fig. 5c). These results indicate
that, with a relatively small number of developmental changes, mouse
teeth can be engineered to express evolutionarily basal traits.


Conclusions 


Our results demonstrate that many of the step-wise transitions that
are widespread in the fossil record of mammalian teeth are reproducible
experimentally. Whereas our results suggest that several, if not the
majority, of dental traits are developmentally linked, individual characters
may respond to different levels of the same signal. These thresholds
may well underlie different morphological gradients that have been
identified along the tooth row. Moreover, trait thresholds may affect
evolutionary rates of individual traits differently. Such data in turn should
be useful when combined with analyses of character correlations, and
in weighting or ordering characters, or objectively assigning transition
weights within characters coded to minimize character dependency effects
in phylogenetic analyses. Other developmental factors and signalling
pathways may influence traits in other ways, but we predict that the
general pattern of results will hold as long as the factors affect the signalling
dynamics of the enamel knots. Finally, as recently proposed for
the evolution of bird and non-avian dinosaur skulls, developmental
data can suggest novel insights into the processes underlying heterochrony.
A better mechanistic basis for heterochrony will help to explain
changes in evolutionary rates, including prediction of intermediate
morphologies even when they have not yet been recovered in the fossil
record. In general, with advancing understanding of development, it
will be possible to experimentally test many more of the known evolutionary
transitions.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
Structure of the DDB1-CRBN E3 ubiquitin
ligase in complex with thalidomide 
William C. Forrester, Markus Schirle, Ulrich Hassiepen, Johannes Ottl, Marc Hild, Rohan E. J. Beckwith, J. Wade Harper,
Jeremy L. Jenkins & Nicolas H. Thomä


In the 1950s, the drug thalidomide, administered as a sedative to pregnant women, led to the birth of thousands of children
with multiple defects. Despite the teratogenicity of thalidomide and its derivatives lenalidomide and pomalidomide,
these immunomodulatory drugs (IMiDs) recently emerged as effective treatments for multiple myeloma and 5q-deletion-associated
dysplasia. IMiDs target the E3 ubiquitin ligase CUL4-RBX1-DDB1-CRBN (known as CRL4^CRBN) and promote the
ubiquitination of the IKAROS family transcription factors IKZF1 and IKZF3 by CRL4^CRBN. Here we present crystal structures
of the DDB1-CRBN complex bound to thalidomide, lenalidomide and pomalidomide. The structure establishes that CRBN
is a substrate receptor within CRL4^CRBN and enantioselectively binds IMiDs. Using an unbiased screen, we identified the
homeobox transcription factor MEIS2 as an endogenous substrate of CRL4^CRBN. Our studies suggest that IMiDs block
endogenous substrates (MEIS2) from binding to CRL4^CRBN while the ligase complex is recruiting IKZF1 or IKZF3 for degradation.
This dual activity implies that small molecules can modulate an E3 ubiquitin ligase and thereby upregulate or
downregulate the ubiquitination of proteins.


Thalidomide (α-(N-phthalimido)glutarimide) was introduced to the
market in 1954 by the company Chemie Grünenthal. Initially promoted
as a sedative with anti-emetic properties, it became popular for treating
‘morning sickness’. In 1961, thalidomide taken in the first trimester of
pregnancy was implicated in frequent limb deformities in infants. Between
8,000 and 12,000 affected children were born before the drug was banned.
Interest in thalidomide revived in 1965, when it was shown to have immunomodulatory
and anti-inflammatory properties in patients with erythema
nodosum leprosum, an inflammatory complication of leprosy. In
1994, thalidomide was found to inhibit the basic fibroblast growth factor
(bFGF)-induced formation of new blood vessels. These findings prompted
clinical trials exploring thalidomide use for anti-angiogenic cancer therapy.
The efficacy of thalidomide and its derivatives lenalidomide and pomalidomide
(collectively known as IMiDs) has since been demonstrated
for several haematological cancers: newly diagnosed multiple myeloma
(thalidomide), refractory multiple myeloma (lenalidomide or pomalidomide)
and 5q-deletion-associated myelodysplastic syndrome (lenalidomide).

The target of thalidomide, cereblon (CRBN), is a ubiquitously expressed
protein that is part of the cullin-4-containing E3 ubiquitin ligase complex
CUL4-RBX1-DDB1 (known as CRL4). Mutations in CRBN are associated
with autosomal recessive, non-syndromic mental retardation. In
myeloma cells, the anti-proliferative activities of IMiDs are linked to CRBN
expression, making IMiDs the first clinically approved drug targeted
at E3 ubiquitin ligases with specificity for CUL4-RBX1-DDB1-CRBN
(CRL4^CRBN). The anti-proliferative and immunomodulatory effects of
IMiDs have recently been linked to drug-induced ubiquitination and degradation
of the transcription factors IKZF1 (also known as IKAROS) and
IKZF3 (also known as AIOLOS) by CRL4^CRBN (refs 14-16). Accordingly,
loss of CRBN is a common determinant of drug resistance in myeloma
cells. How IMiD binding affects CRL4^CRBN at the molecular level remains
unclear. We set out to examine the role of CRBN within the E3 ubiquitin
ligase complex CRL4^CRBN, characterizing the effect of IMiD binding on
ligase activity.


Structure of DDB1-CRBN bound to IMiDs 


We crystallized a chimaeric complex of human DDB1 and Gallus gallus
(chicken) CRBN bound to thalidomide (refined to 3.0 Å), lenalidomide
(3.0 Å) and pomalidomide (3.5 Å) (Fig. 1a, b and Extended Data Table 1).
The high level of sequence conservation between human and chicken
CRBN (Extended Data Fig. 1a, b) allows structural insight into human
CRBN to be inferred directly from G. gallus CRBN. All subsequent biochemical
and cell-biological experiments were performed with full-length
human proteins. G. gallus CRBN consists of three sub-domains (Extended
Data Fig. 2a-f): a seven-stranded β-sheet located in the amino-terminal
domain (NTD, residues 1-185) (Extended Data Fig. 2a), a seven α-helical
bundle domain (HBD, residues 186-317) involved in DDB1 binding
(Fig. 1c), and a carboxy-terminal domain composed of eight β-sheets
(CTD, residues 318-445) (Fig. 1b). DDB1 comprises three seven-bladed
WD40 β-propellers (BPA, BPB and BPC) arranged in a triangular fashion,
with G. gallus CRBN attaching to a cavity between the BPA and BPC
propellers (Fig. 1c). The molecular basis of the HBD-mediated attachment
of G. gallus CRBN to DDB1 defines a novel class of DDB1 binders and
differs in detail from previously reported DDB1 attachment modules
(Extended Data Fig. 2e, f).

The G. gallus CRBN N-terminal region (residues 46-317), including
the NTD and HBD, resembles the N-terminal domain of bacterial Lon
proteases (Protein Data Bank (PDB) ID 3LJC; root mean squared deviation
(r.m.s.d.), 2.7 Å over 178 residues aligned) (Extended Data Fig. 2b).
The CTD harbours the thalidomide-binding pocket and contains a conserved
Zn²⁺-binding site, which is situated approximately 18 Å from the
IMiD (Fig. 1a, b). The Zn²⁺ ion is coordinated through conserved cysteine
residues 325, 328, 393 and 396. The G. gallus CRBN-CTD shares structural
Figure 1 | The overall structure of the DDB1-CRBN complex. a, Cartoon
representation of the structure of the complex of human DDB1, G. gallus CRBN
and thalidomide: DDB1, highlighting the domains BPA (red), BPB (magenta),
BPC (orange) and DDB1-CTD (grey); G. gallus CRBN, highlighting the
domains NTD (blue), HBD (cyan) and CTD (green). The Zn²⁺ ion is drawn as
a grey sphere. b, As in a, with thalidomide shown as a yellow stick structure.
A close-up showing that all IMiDs occupy a common binding site on CRBN
(solid boxed area; red, oxygen; blue, nitrogen) and a close-up of the overall
G. gallus CRBN-CTD architecture (dashed boxed area) are shown. c, G. gallus
CRBN-HBD helices and their interactions with DDB1.


similarity with the pseudouridine synthase and archaeosine transglycosylase
(PUA) fold family, which are involved in binding diverse sets
of ligands (Extended Data Fig. 2c, d).


IMiD binding to CRBN 

Thalidomide, lenalidomide and pomalidomide (Fig. 2a-c and Extended
Data Fig. 3a-i) bind a pocket on G. gallus CRBN-CTD (Fig. 1b) situated
in a surface groove that is highly conserved across CRBN orthologues
(Extended Data Fig. 1b). The three ligands superimpose with very little
deviation in the α-(isoindolinone-2-yl) glutarimide moiety, which contributes
the majority of interactions between the receptor and the compounds
and is the main pharmacophore. The glutarimide group is held
in a buried cavity between G. gallus CRBN sheets β10 and β13 (Fig. 2d).
The glutarimide carbonyls (C2 and C6) and the intervening amide (N1)
are in hydrogen-bonding distance with G. gallus CRBN residues His 380
and Trp 382, respectively (Fig. 2c, d). A delocalized lone pair connects
the glutarimide nitrogen with the two glutarimide carbonyls (C2-N1-C6)
and is coplanar with Trp 382. The opposing aliphatic face of the glutarimide
ring (C3, C4 and C5) is in tight van der Waals contact with a
hydrophobic pocket lined by Trp 382, Trp 388, Trp 402 and Phe 404.
In vitro, mutations of Tyr 386 (which affect the integrity of the binding
pocket) and Trp 388 (which is directly involved in compound binding)
to alanine ablate the binding of all three IMiDs to CRBN (Extended
Data Fig. 4a-c). Mutations of the equivalent residues render CRL4^CRBN
insensitive to the presence of thalidomide or lenalidomide in vivo.
In addition to the almost identical binding modes, we found that thalidomide,
lenalidomide and pomalidomide have similar affinities for CRBN,
with dissociation constants of ~250 nM, ~178 nM and ~157 nM, respectively
(Extended Data Fig. 4d-h).
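To see how close these affinities are in practice, the dissociation constants can be converted into a fractional occupancy of CRBN at a given drug concentration. The sketch below assumes simple 1:1 equilibrium binding with the free ligand concentration approximately equal to the total, which is an idealization not stated in the text; the function name and the 1 µM example concentration are chosen here only for illustration.

```python
def fraction_bound(ligand_nm, kd_nm):
    """Fractional occupancy for 1:1 binding: [L] / ([L] + Kd).

    Assumes free ligand ~ total ligand (receptor not in excess).
    """
    if ligand_nm < 0 or kd_nm <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nm / (ligand_nm + kd_nm)


# Approximate dissociation constants quoted in the text (nM).
KD_NM = {"thalidomide": 250.0, "lenalidomide": 178.0, "pomalidomide": 157.0}

# Occupancy at an illustrative 1 uM (1000 nM) free drug concentration.
for drug, kd in KD_NM.items():
    print(f"{drug}: {fraction_bound(1000.0, kd):.2f}")
```

Under these assumptions the three compounds occupy broadly similar fractions of the receptor, consistent with the statement that their affinities are comparable.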
Figure 2 | IMiD binding to CRBN. a, Chemical structure of lenalidomide.
b, Chemical structure of pomalidomide. c, Sketch of thalidomide and its
interactions with G. gallus CRBN. Hydrogen bonds are shown as dashed lines,
and hydrophobic interactions are indicated as green semicircles. d, IMiDs are
anchored through hydrogen bonding of the glutarimide moiety to G. gallus
CRBN His 380 and Trp 382, as well as through the aliphatic face of the
glutarimide being engulfed in a hydrophobic cage. e, Surface representation of
G. gallus CRBN (grey) and (S)-lenalidomide shown as a yellow stick structure,
together with its positive mFo − DFc electron density map (σ = 3.5) shown in
green. The fit of the (S)- and (R)-enantiomers is indicated.


Thalidomide differs from lenalidomide and pomalidomide in the C4
phthalimide aniline functionality (Fig. 2a-c), which is found in a solvent-exposed
position. The common carbonyl at the phthaloyl C1 position
contributes a water-mediated hydrogen bond to His 359, which anchors
the phthaloyl ring system together with stacking interactions provided by
the aliphatic face of Pro 354 (Fig. 2d). The phthalimide C5 and C6 positions
are fully solvent exposed. The overall shape of the buried IMiD-binding
pocket favours binding of the (S)-enantiomer over the (R)-enantiomer
(Fig. 2e), which is in agreement with in vivo experiments.


CRBN functions as a DCAF for the ligase CRL4^CRBN


Within the CRL4 family of ligases, DDB1 functions as the adaptor connecting
the substrate receptor to the ligase CUL4 (refs 17, 19, 23). More
than a dozen substrate receptors, including CRBN, have been identified
(and these are designated DCAFs for DDB1- and CUL4-associated factors).
G. gallus CRBN, despite lacking the canonical DCAF WD40 fold,
resembles a substrate receptor in its dimensions and position on DDB1
(Extended Data Fig. 5a, b). The thalidomide-binding site is situated where
substrates generally bind to WD40 DCAFs (see, for example, DDB2
engaging DNA (Fig. 3a and Extended Data Fig. 5a, b)). The equivalent
residues in the structurally related PUA-domain-containing proteins are
directly engaged in ligand binding (Extended Data Fig. 5c-e). The PUA
domain of CRBN has striking structural similarity to a member of the
methionine sulphoxide reductase family (Extended Data Fig. 2c, d). In
G. gallus CRBN, the now defunct active centre of the reductase interacts
with IMiDs. Truncation of the CRBN C terminus bordering the
conserved thalidomide-binding pocket has been found in non-syndromic
mental retardation (see Extended Data Fig. 6a-c for analysis of CRBN
mutations).

Within CUL4-RBX1-DDB1-DCAF complexes, CUL4 was found
to freely rotate up to 150° around DDB1 (refs 24, 25) (Fig. 3a, b). CUL4
mobility is undirected and is driven exclusively by Brownian motion. Given
the strictly modular architecture of the CRL4 family, the structure
Figure 3 | CRBN is a substrate receptor in the ligase CRL4^CRBN.
a, Architecture of the CRL4^DDB2 complex bound to DNA (PDB ID 4A0K).
b, Model of CRL4^CRBN bound to thalidomide. c, Firefly luciferase (Fluc) to
Renilla luciferase (Rluc) ratios (Fluc:Rluc) of IKZF1-reporter-plasmid-transfected
HEK 293T cells following incubation with the indicated
thalidomide derivatives. The data are presented as mean ± s.e.m. (n = 4). Cpd,
compound; Len, lenalidomide; Pom, pomalidomide; Thal, thalidomide.


of CRL4^CRBN can be predicted with high confidence (Fig. 3b). Rotation
of the ligase arm of CUL4 around DDB1 and CRBN establishes a ubiquitination
zone with dimensions of up to 340 Å × 110 Å × 30 Å (Fig. 3b),
with the centre of rotation near the thalidomide-binding site. The CRL4
ligase is promiscuous, targeting lysines that cross this ubiquitination
zone. Accordingly, we observed that CRBN is autoubiquitinated in vitro
(Extended Data Fig. 7a-d) at the unstructured N-terminal tail (residues
Lys 39 and Lys 43; see also Extended Data Fig. 7d). We found that
CRBN autoubiquitination persisted in the presence of IMiDs in vitro
(Extended Data Fig. 7b) and was subject to inhibition by the Cop9
signalosome (CSN) (Extended Data Fig. 7b, c).

Pomalidomide and lenalidomide effectively target the IKAROS transcription
factors IKZF1 and IKZF3 for degradation by CRL4^CRBN. Thalidomide,
by contrast, is less efficient here (Fig. 3c). All three IMiDs
have comparable affinities for CRBN (Extended Data Fig. 4e-h) and similar
predicted membrane permeabilities (Extended Data Fig. 7e). The major
structural difference between CRBN-bound lenalidomide and pomalidomide
and CRBN-bound thalidomide lies in the presence of the
solvent-exposed C4 aniline functionality in lenalidomide and pomalidomide.
Therefore, different functional groups at the phthaloyl C4, C5
and C6 positions were tested for their ability to promote IKZF1 degradation
in a dual luciferase reporter assay (see Supplementary Methods). Small groups
attached to C4 of thalidomide (NH₂, CH₃ and to some extent Cl) promoted
IKZF1 degradation (Fig. 3c). By contrast, larger substituents at
the C4, C5, or C4 and C6 positions were less effective at promoting IKZF1
degradation. These modifications are not expected to affect CRBN binding,
as even a large substituent at the C4 position had no adverse effect on
affinity (Extended Data Fig. 4d). Assuming comparable cellular concentrations,
this finding indicates that solvent-exposed bulky groups at C4
(and methyl groups at C5 and C6) interfere with IKZF1 binding. The
aniline and methyl substituents at C4, however, appear to contribute to
IKZF1 degradation, probably through direct interactions.
MEIS2 is an endogenous ligase substrate of CRL4^CRBN


Next, we set out to examine the effects of IMiDs on endogenous CRBN
substrates, which have so far remained elusive. An unbiased biochemical
screen was established to identify proteins that are ubiquitinated by
CRL4^CRBN. Human protein microarrays (~9,000 proteins) were used
for on-chip ubiquitination by the complex CUL4A-RBX1-DDB1-CRBN
(CRL4A^CRBN) in the presence of E1 (UBA1), E2 (UBCH5A), ubiquitin
(biotin-ubiquitin) and ATP (Extended Data Fig. 8a-e). We reasoned
that a substrate would be subject to ubiquitination by CRL4A^CRBN but
not by a control ligase (CRL4A^DDB2), that a substrate would overcome
inhibition of CRL4A^CRBN by CSN and that ubiquitination of such
substrates would be inhibited by lenalidomide. Following cluster analysis
(43 clusters), we identified one cluster that best matched the expected
ubiquitination profile (see Supplementary Methods for details). In an
orthogonal screen, the top five candidate genes (GRINL1A, MBOAT7,
OTUD7B, C6orf141 and MEIS2) were overexpressed in HEK 293T cells
and assessed for lenalidomide-induced changes in steady-state levels
(Extended Data Fig. 8f). Of these factors, we focused on MEIS2, a transcription
factor that has been implicated in various aspects of human
development, which we found was stabilized on lenalidomide treatment
(see below).
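The three criteria used to define the expected ubiquitination profile can be written down as a simple predicate over per-protein signals. The sketch below is only an illustration of that logic, not the cluster analysis actually performed (which is described in the Supplementary Methods): the field names, the twofold threshold and the placeholder spot values are all assumptions invented here.

```python
from dataclasses import dataclass


@dataclass
class SpotSignal:
    """Hypothetical on-chip ubiquitination signals for one arrayed protein."""
    name: str
    crbn: float       # with the CRBN-containing ligase
    control: float    # with the control ligase
    crbn_csn: float   # CRBN-containing ligase plus CSN
    crbn_len: float   # CRBN-containing ligase plus lenalidomide


def matches_expected_profile(s, fold=2.0):
    """Apply the three criteria from the text with an assumed fold cutoff."""
    specific = s.crbn >= fold * s.control        # not a control-ligase target
    overcomes_csn = s.crbn_csn >= fold * s.control  # signal persists with CSN
    drug_inhibited = s.crbn_len * fold <= s.crbn    # lenalidomide reduces it
    return specific and overcomes_csn and drug_inhibited


# Placeholder values for illustration only; not data from the screen.
spots = [
    SpotSignal("protein_A", crbn=10.0, control=1.0, crbn_csn=8.0, crbn_len=2.0),
    SpotSignal("protein_B", crbn=10.0, control=9.0, crbn_csn=8.0, crbn_len=2.0),
    SpotSignal("protein_C", crbn=10.0, control=1.0, crbn_csn=1.2, crbn_len=2.0),
    SpotSignal("protein_D", crbn=10.0, control=1.0, crbn_csn=8.0, crbn_len=9.5),
]
candidates = [s.name for s in spots if matches_expected_profile(s)]
print(candidates)
```

Only the placeholder that satisfies all three conditions is retained; each of the others fails exactly one criterion, which shows why the criteria are applied jointly rather than one at a time.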

To recapitulate the on-chip results in solution, insect-cell-derived
MEIS2 was used to establish a fully recombinant CRL4^CRBN ubiquitination
system. MEIS2 expressed from insect cells was isolated in a hyperphosphorylated
form. Dephosphorylation was found to improve protein
behaviour, resulting in increased MEIS2 ubiquitination using lysine-free
(K0) ubiquitin (Fig. 4a, compare lanes 1 and 2). MEIS2 ubiquitination was
also observed using phosphorylated MEIS2 (data not shown). MEIS2
ubiquitination by CRL4^CRBN was subject to inhibition by various IMiDs
(Fig. 4a, lanes 4-8, and Extended Data Fig. 9a), irrespective of the chemical
substituent at the C4, C5 or C6 position. A CRL4^CRBN mutant carrying the
Tyr384Ala and Trp386Ala substitutions (CRBN^YW/AA) that rendered the
receptor unable to bind IMiDs (Extended Data Fig. 4a-c) also impaired
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Figure 4 | MEIS2 is an endogenous substrate of the ligase CRL4^CRBN. 


a, CRL4^CRBN ubiquitinates MEIS2 in vitro (lane 2), a reaction inhibited by 
thalidomide (Thal, lane 4), pomalidomide (Pom, lane 8), compound (Cpd) 7 
(lane 5), compound 8 (lane 6) and compound 4 (X, extended carbon linker) 
(lane 7). Lysine-free (K0) ubiquitin (Ub) was used to obtain a defined band-shift 
detected by anti-MEIS2 immunoblotting. b, SK-N-DZ cells were pretreated 
with 10 µM lenalidomide or DMSO before addition of 100 µg ml⁻¹ CHX for 
the indicated times. MEIS2 and CRBN levels were detected using anti-MEIS2 
and anti-CRBN immunoblotting. Histone H4 served as a loading control. 

c, M059J cells were transfected with one of four short interfering RNA (siRNA) 
constructs (labelled A to D) targeting CRBN, or a negative control (Neg. 
siRNA2), and the levels of endogenous CRBN and MEIS2 proteins were 
monitored. The asterisk indicates a non-specific band, and ERK2 served as a 
loading control. d, Treatment of M059J cells with the indicated amounts of 
IMiDs, MLN4924 or bortezomib (Btz) resulted in an increase in the steady-state 
MEIS2 protein levels after 4 h. The data are presented as mean ± s.e.m. 
(IMiDs, n = 3; MLN4924 and Btz, n = 2). 
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MEIS2 ubiquitination (Extended Data Fig. 9b). In accordance with our on- 
chip results, we found that MEIS2 can relieve the inhibition of CRL4^CRBN 
by CSN (Extended Data Fig. 9c) and is not ubiquitinated by the control 
ligases CRL4^DDB2 and CRL4^CSA (Extended Data Fig. 9d). These data suggest 
that, within the CRL4^CRBN complex, CRBN targets MEIS2 for ubiquitination 
in vitro with the aid of its IMiD-binding site. 

We next sought to test the effect of lenalidomide on MEIS2 half-life 
in cells by performing cycloheximide (CHX) chase experiments. RNA- 
seq data from The Cancer Genome Atlas (TCGA) identified the neu- 
roblastoma cell line SK-N-DZ as having high levels of endogenous 
MEIS2 and CRBN transcripts. Following treatment with 100 µg ml⁻¹ 
CHX, we found that MEIS2 protein levels were largely depleted after 
3 h (Fig. 4b, lanes 1-4). The addition of 10 µM lenalidomide stabilized 
MEIS2 protein levels under these conditions (Fig. 4b, lanes 5-8). As SK- 
N-DZ cells were recalcitrant to transfection, we subsequently employed 
M059J cells, which are similarly characterized by high levels of MEIS2 
messenger RNA (TCGA), to examine the effects of CRBN depletion by 
RNA interference on MEIS2 abundance. For all of the experiments, the 
endogenous MEIS2 levels were monitored by quantitative immunoblotting 
using infrared detection (see Extended Data Fig. 9e-p). M059J or 
HEK 293T cells were transfected with four short interfering RNAs, result- 
ing in efficient CRBN knockdown (Fig. 4c) and increased MEIS2 steady- 
state levels in M059J cells (Fig. 4c) and HEK 293T cells (data not shown), 
implicating CRBN in MEIS2 turnover. 

To test whether lenalidomide treatment results in increased steady- 
state levels of MEIS2, M059J cells were treated with lenalidomide or a 
dimethyl sulphoxide (DMSO) control. Following lenalidomide treatment, 
endogenous MEIS2 protein levels were elevated by up to 2.5-fold (Fig. 4d 
and Extended Data Fig. 9i-k). A similar level of steady-state MEIS2 stabilization 
was observed with MLN4924 (Extended Data Fig. 9l, lane 6), a 
NEDD8-activating enzyme inhibitor, and with the proteasome inhibitor 
bortezomib (Extended Data Fig. 9i, lane 5). Increases in MEIS2 protein 
levels following lenalidomide exposure were not due to mRNA changes, 
as determined by quantitative reverse transcription PCR (RT-PCR) 
(Extended Data Fig. 9m). 

When examining different IMiD derivatives, we found that thalid- 
omide, lenalidomide and pomalidomide stabilized steady-state MEIS2 
protein levels to a similar extent (Fig. 4d and Extended Data Fig. 9i, k). 
The effect of thalidomide on MEIS2 levels was also observed in vivo: 
whole zebrafish embryos 24 h post fertilization showed a 1.5-fold increase 
in MEIS2 levels, comparable to that observed in the cell lines (Extended 
Data Fig. 9l). 


Thalidomide is an agonist and an antagonist 


We found that IMiD binding and MEIS2 recruitment were mutually 
exclusive in vitro and in vivo (Figs 4a and 5a-c and Extended Data Fig. 9b). 
Accordingly, a stable cell line overexpressing an epitope-tagged MEIS2 
together with the CRBN^YW/AA mutant (Extended Data Fig. 9f, g) exhibited 
no changes in MEIS2 levels following the addition of 40 µM lenalidomide. 
MEIS2 is thus an endogenous, IMiD-sensitive target of CRBN (see Sup- 
plementary Discussion for the putative role of MEIS2 in IMiD-mediated 
teratogenicity). Most CRLs studied so far target multiple substrates. 
Given that IMiDs occupy the canonical ligand interface of the CRBN 
PUA domain, we also expect that other endogenous substrates are pre- 
vented from CRBN binding by thalidomide and its derivatives. By con- 
trast, IMiD-dependent targeting of the IKAROS transcription factors 
(Fig. 5d) is facilitated by specific solvent-exposed functionalities that 
are not involved in CRBN binding. We propose a model in which the 
interaction surface used for IKAROS family member binding comprises 
CRBN (in its IMiD-bound form) and the C4, C5 and C6 phthaloyl posi- 
tions of the IMiD. While structural studies on CRBN-IMiD-IKAROS 
complexes are necessary to shed light on the details of the binding mode, 
conceptually this mechanism bears striking similarities to auxin-induced 
degradation of members of the Aux/IAA repressor family by the ligase 
TIR1 (ref. 35) and to cyclosporin-A-induced cyclophilin binding (or 
FK506-induced FKBP12 binding) to calcineurin. The CRL4 architecture 
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Figure 5 | Molecular model of IMiD function. a, Thalidomide binds to 
CRBN at the canonical substrate-binding site. b, The potent anti-myeloma drug 
thalidomide and its derivatives lenalidomide and pomalidomide occupy the 
same site but with different solvent-exposed moieties. c, Binding of the endogenous 
substrate MEIS2 and the IMiDs to this site is mutually exclusive. d, We propose 
a direct interaction of the C4 amine of lenalidomide or pomalidomide with 
IKAROS transcription factors (IKZF1 and IKZF3). The binding of these factors 
probably also involves the surrounding residues of CRBN. 


supports ubiquitination in its vicinity, a property that is exploited by 
viral proteins in recruiting cellular targets for degradation by CRL4s. 
As small molecules can apparently mimic this behaviour, it will be impor- 
tant to explore whether synthetic small molecules can promote the deg- 
radation of other substrates that are not typically targeted by a specific 
ubiquitin ligase. 

Our structure-function analysis indicates that IMiD-mediated IKAROS 
transcription factor degradation simultaneously interferes with the recruitment 
of endogenous substrates (such as MEIS2) to CRL4^CRBN. Depending 
on the cell type and the proteins expressed, the administration of 
thalidomide and its derivatives will thus simultaneously affect the levels 
of two groups of proteins: upregulating the endogenous substrates while 
decreasing the amounts of neo-substrates. Thalidomide and its IMiD 
derivatives give rise to pleiotropic clinical effects, ranging from anti-cancer 
and immunomodulatory properties to pronounced teratogenicity. Both 
loss of ubiquitination, as seen for MEIS2, and gain of function, as seen 
for IKAROS family members, and even complex synthetic combinations 
of these opposing changes, need to be considered as underlying causes 
of these diverse clinical effects. 


METHODS SUMMARY 


All recombinant protein complexes were produced in insect cells. The crystal 
structures of human DDB1 and G. gallus CRBN were determined by molecular 
replacement with a DDB1 search model and subsequent iterative model building 
for CRBN. In vitro ubiquitination assays were performed as previously described. 
Protein-array-based ubiquitin ligase profiling was performed using ProtoArray 
v5.0 (Life Technologies) and analysed with the R software package. Cell-biological 
experiments were performed following standard procedures. Binding assays were 
based on time-resolved fluorescence resonance energy transfer (TR-FRET) or fluor- 
escence polarization methods. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
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Type Iax supernovae are stellar explosions that are spectroscopically 
similar to some type Ia supernovae at the time of maximum light 
emission, except with lower ejecta velocities. They are also distinguished 
by lower luminosities. At late times, their spectroscopic 
properties diverge from those of other supernovae, but their composition 
(dominated by iron-group and intermediate-mass elements) 
suggests a physical connection to normal type Ia supernovae. Supernovae 
of type Iax are not rare; they occur at a rate between 5 and 30 
per cent of the normal type Ia rate. The leading models for type Iax 
supernovae are thermonuclear explosions of accreting carbon-oxygen 
white dwarfs that do not completely unbind the star, implying that 
they are ‘less successful’ versions of normal type Ia supernovae, where 
complete stellar disruption is observed. Here we report the detection 
of the luminous, blue progenitor system of the type Iax SN 2012Z in 
deep pre-explosion imaging. The progenitor system’s luminosity, colours, 
environment and similarity to the progenitor of the Galactic 
helium nova V445 Puppis suggest that SN 2012Z was the explosion 
of a white dwarf accreting material from a helium-star companion. 
Observations over the next few years, after SN 2012Z has faded, 
will either confirm this hypothesis or perhaps show that this supernova 
was actually the explosive death of a massive star. 


SN 2012Z was discovered in the Lick Observatory Supernova Search 
on 2012 January 29.15 UT. It had an optical spectrum similar to the type 
Iax (previously called SN 2002cx-like) SN 2005hk (see Extended Data 
Fig. 1). The similarities between type Iax and normal type Ia supernovae 
make understanding the progenitors of the former important, especially 
because no progenitor of the latter has been identified. Like core-collapse 
supernovae (but also slowly declining, luminous type Ia supernovae), 
type Iax supernovae are found preferentially in young, star-forming 
galaxies. A single type Iax supernova, SN 2008ge, was in a relatively 
old (S0) galaxy with no indication of current star formation to deep 
limits. Non-detection of the progenitor of SN 2008ge in Hubble Space 
Telescope (HST) pre-explosion imaging restricts its initial mass to 
< 12 M⊙ (where M⊙ is the solar mass), and combined with the lack 
of hydrogen or helium in the SN 2008ge spectrum, favours a white dwarf 
progenitor. 

Deep observations of NGC 1309, the host galaxy of SN 2012Z, were 
obtained with HST in 2005-06 and 2010, serendipitously including the 
location of the supernova before its explosion. To pinpoint the position 
of SN 2012Z with high precision, we obtained follow-up HST data in 
2013. Colour-composite images made from these observations before 
and after the supernova are shown in Fig. 1, and photometry of stellar 




Figure 1 | HST colour images before and after supernova 2012Z. a, Hubble 
Heritage image of NGC 1309 (http://heritage.stsci.edu/2006/07); panels b and 
c zoom in on the progenitor system S1 in the deep, pre-explosion data. 


d, e, Shallower post-explosion images of SN 2012Z on the same scale as b and 
c, respectively. The source data for these images are available as Supplementary 
Information. 
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Figure 2 | Colour-magnitude diagrams of the SN 2012Z progenitor S1 
and comparison models. a, The F435W — F555W colour (roughly B — V), 
b, the F555W — F814W colour (roughly V — I), both plotted against the 
F555W (V) absolute magnitude. Black and brown crosses represent the 
progenitor systems for SN 2012Z and V445 Pup, respectively, with 1σ 
photometric uncertainties. Other comparisons plotted include evolutionary 


sources in the pre-explosion images near the supernova location is 
reported in Extended Data Table 1. We detect a source, called S1, coincident 
with the supernova at a formal separation of 0.0082″ ± 0.0103″ 
(equal to 1.3 ± 1.6 pc at 33 Mpc, the distance to NGC 1309; refs 20, 21). 
The pre-explosion data reach a 3σ limiting magnitude of M_V ~ −3.5 mag, 
quite deep for typical extragalactic supernova progenitor searches, but 
certainly the possibility exists that the progenitor system of SN 2012Z 
was of lower luminosity and would be undetected in our data (as has 
been the case for all normal type Ia supernova progenitor searches to 
date). However, the locations of SN 2012Z and S1 are identical to 
within 0.8σ, and we estimate only a 0.24% (2.1%) probability that a 
random position near SN 2012Z would be within 1σ (3σ) of any detected 
star, making a chance alignment unlikely (see Methods, and Extended 
Data Fig. 2). We also observe evidence for variability in S1 (plausible 
for a pre-supernova system; Extended Data Table 2), at a level exhibited 
by only 4% of objects of similar brightness. We thus conclude there is a 
high likelihood that S1 is the progenitor system of SN 2012Z. 
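
The astrometric numbers quoted above can be checked with a few lines of arithmetic. The following Python sketch converts the angular offset to a projected distance using the small-angle approximation and also applies the standard distance modulus to the quoted limiting absolute magnitude. The function and variable names are illustrative assumptions, and the magnitude conversion ignores extinction.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206,265


def arcsec_to_parsec(angle_arcsec: float, distance_mpc: float) -> float:
    """Projected size (pc) subtended by a small angle at a given distance."""
    angle_rad = angle_arcsec / ARCSEC_PER_RADIAN
    return angle_rad * distance_mpc * 1.0e6  # 1 Mpc = 1e6 pc


def apparent_magnitude(absolute_mag: float, distance_mpc: float) -> float:
    """Distance modulus m = M + 5 log10(d / 10 pc), ignoring extinction."""
    distance_pc = distance_mpc * 1.0e6
    return absolute_mag + 5.0 * math.log10(distance_pc / 10.0)


DISTANCE_MPC = 33.0          # distance to NGC 1309 quoted in the text
SEPARATION_ARCSEC = 0.0082   # formal separation between SN 2012Z and S1
UNCERTAINTY_ARCSEC = 0.0103  # 1-sigma uncertainty on that separation

separation_pc = arcsec_to_parsec(SEPARATION_ARCSEC, DISTANCE_MPC)
uncertainty_pc = arcsec_to_parsec(UNCERTAINTY_ARCSEC, DISTANCE_MPC)
offset_in_sigma = SEPARATION_ARCSEC / UNCERTAINTY_ARCSEC
limiting_apparent_mag = apparent_magnitude(-3.5, DISTANCE_MPC)

print(f"separation: {separation_pc:.1f} +/- {uncertainty_pc:.1f} pc")
print(f"offset: {offset_in_sigma:.1f} sigma")
print(f"limiting apparent V magnitude: {limiting_apparent_mag:.1f}")
```

With the quoted inputs this reproduces the 1.3 ± 1.6 pc separation and the 0.8σ offset; the limiting absolute magnitude corresponds to an apparent magnitude of roughly 29 under the no-extinction assumption.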

The colour-magnitude diagram (CMD) presented in Fig. 2 shows 
S1 to be luminous and blue, yet in an odd place on the diagram for a 
star about to explode. If its light is dominated by a single star, S1 is moderately 
consistent with an ~18.5 M⊙ main-sequence star, an ~11 M⊙ 
blue supergiant early in its evolution off the main sequence, or perhaps 
an ~7.5 M⊙ (initial mass) blue supergiant later in its evolution (with 
core helium-burning in a blue loop, where models are quite sensitive 
to metallicity and rotation). None of these stars are expected to explode 
in standard stellar evolution theory, particularly without any signature 
of hydrogen in the supernova. 

The SN 2012Z progenitor system S1 is in a similar region in the CMD 
to some Wolf-Rayet stars: these are highly evolved, massive stars that 
are expected to undergo core collapse and may produce a supernova. If 
S1 were a single Wolf-Rayet star, its photometry would be most consistent with 
the WN subtype and an initial mass of ~30-40 M⊙; such Wolf-Rayet 
stars are thought perhaps to explode with a helium-dominated outer 
layer as a type Ib supernova, and to be unlikely to produce the structure 
and composition of ejecta seen in type Iax supernovae. Moreover, 
isochrones fitted to the neighbouring stars (Extended Data Fig. 3) 




tracks for single stars (coloured dotted curves, with initial mass ranging from 7 
to 11 M⊙ as indicated), thermal models for Eddington-luminosity super-soft 
sources (SSS; purple dots), candidate Wolf-Rayet stars (WR; blue-grey 
stars), and models for helium-star donors to 1.2 M⊙ initial mass carbon-oxygen 
white dwarfs (shaded blue regions). The effect of interstellar 
extinction with A_V = 0.5 mag is also shown (magenta arrow). 


yield an age range of ~10-42 Myr, longer than the 5-8 Myr lifetime of 
such a massive Wolf-Rayet star. 

S1 may be dominated by accretion luminosity; its brightness in B 
and V bands is not far from the predicted thermal emission of an 
Eddington-luminosity Chandrasekhar-mass white dwarf (a super-soft 
source, SSS in Fig. 2). However, its V − I and V − H colours are too red 
for a SSS model. A composite scenario, with accretion power dominat- 
ing the blue flux, and another source providing the redder light (per- 
haps a fainter, red donor star) may be plausible. 

The leading models of type Iax supernovae are based on explosions 
of carbon-oxygen white dwarfs, so S1 may be the companion star 
to an accreting white dwarf. Although there are a variety of potential 
progenitor systems (including main-sequence and red-giant donors, which 
are inconsistent with S1 if they dominate the system’s luminosity), in 
standard scenarios no companion star can have an initial mass greater 
than ~7 M⊙; otherwise, there would not be enough time to form the 
primary carbon-oxygen white dwarf that explodes. Thus, the photometry 
of S1 suggests that if it is the companion to a carbon-oxygen white 
dwarf, recent binary mass transfer must have played a role in its evolution. 
One model for a luminous, blue companion star is a relatively massive 
(~2 M⊙ when observed) helium star, formed after binary 
mass transfer and a common envelope phase (for example, a close binary 
with initial masses of ~7 and ~4 M⊙). Although the model parameter 
space has not been fully explored, the predicted region in the CMD for 
helium star donors in a binary system with a 1.2 M⊙ initial-mass accreting 
carbon-oxygen white dwarf is shown in Fig. 2, and S1 is consistent 
with being in this region. The evolutionary timescale for such a 
model is also well matched to the ages of nearby stars (Extended Data 
Fig. 3). 

SN 2012Z and the star S1 have an interesting analogue in our own 
Milky Way, namely, the helium nova V445 Puppis, thought to be 
the surface explosion of a near-Chandrasekhar-mass helium-accreting 
white dwarf. Though S1 is somewhat brighter than the pre-explosion 
observations of V445 Pup, their consistent colours, similar variability 
amplitude, and the physical connection between V445 Pup and likely 


progenitors of type Iax supernovae are highly suggestive. Indeed, two 


7 AUGUST 2014 | VOL 512 | NATURE | 55 


©2014 Macmillan Publishers Limited. All rights reserved 




type Iax supernovae (though not SN 2012Z itself) have shown evidence 
for helium in the system. In this model, a low helium accretion rate 
could lead to a helium nova (like V445 Pup), whereas a higher mass-transfer 
rate could result in stable helium burning on the carbon-oxygen 
white dwarf, allowing it to grow in mass before the supernova. The accretion 
is expected to begin as the helium star starts to evolve and grow 
in radius; indeed, S1 photometry is consistent with the evolutionary 
track of a helium star with a mass (after losing its hydrogen envelope) 
of ~2 M⊙, on its way to becoming a red giant. 

Although the scenario of a helium-star donor to an exploding carbon-oxygen 
white dwarf is a promising model for the progenitor and supernova 
observations, we cannot yet rule out the possibility that S1 is 
a single star that itself exploded. Fortunately, by late 2015, SN 2012Z 
will have faded below the brightness of S1, and HST imaging will allow 
us to distinguish between these models. Our favoured interpretation of 
S1 as the companion star predicts that it will still be detected (though 
perhaps modified by the impact of its exploding neighbour, a reduction 
in accretion luminosity, or a cessation of variability). On the other hand, 
if S1 has completely disappeared, it will be a strong challenge to models 
of type Iax supernovae, and will perhaps blur the line between thermo- 
nuclear white-dwarf supernovae and massive-star core-collapse super- 
novae, with important impacts on our understanding of stellar evolution 
and chemical enrichment. 


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Tunable spin-spin interactions and entanglement of 
ions in separate potential wells 


A. C. Wilson, Y. Colombe, K. R. Brown, E. Knill, D. Leibfried & D. J. Wineland 


Quantum simulation, the use of one quantum system to simulate 
a less controllable one, may provide an understanding of the many 
quantum systems which cannot be modelled using classical computers. 
Considerable progress in control and manipulation has been 
achieved for various quantum systems, but one of the remaining 
challenges is the implementation of scalable devices. In this regard, 
individual ions trapped in separate tunable potential wells are promising. 
Here we implement the basic features of this approach 
and demonstrate deterministic tuning of the Coulomb interaction 
between two ions, independently controlling their local wells. The 
scheme is suitable for emulating a range of spin-spin interactions, 
but to characterize the performance of our set-up we select one that 
entangles the internal states of the two ions with a fidelity of 0.82(1) 
(the digit in parentheses shows the standard error of the mean). Extension 
of this building block to a two-dimensional network, which is 
possible using ion-trap microfabrication processes, may provide a 
new quantum simulator architecture with broad flexibility in designing 
and scaling the arrangement of ions and their mutual interactions. 
To perform useful quantum simulations, including those of 
condensed-matter phenomena such as the fractional quantum Hall 
effect, an array of tens of ions might be sufficient. 

The use of effective spin-spin interactions between ions in separate 
potential wells is a key feature of proposals for simulation with two-dimensional 
systems of quantum spins with arbitrary conformations 
and versatile couplings. In addition, these effective spin-spin interactions 
may enable logic operations to be performed in a multi-zone 
quantum information processor without the need to bring the quantum 
bits (qubits) into the same trapping potential well. Such coupling 
might also prove useful for metrology and sensing. For example, 
it could extend the capabilities of quantum-logic spectroscopy to ions 
that cannot be trapped within the same potential as the measurement 
ion, such as oppositely charged ions or even antimatter particles. Coupling 
could be obtained either through mutually shared electrodes 
or directly through the Coulomb interaction. 

In the experiments described here, two ions of mass $m$ are trapped at 
equilibrium distance $d_0$ in independent, approximately harmonic potential 
wells. Coulomb interaction between the ions leads to dipole-dipole-type 
coupling, with strength $\Omega_{\mathrm{ex}} \propto d_0^{-3}$ (Methods), where the oscillations 
of the ions in their respective wells manifest the dipoles. The coupled 
system has six normal modes, four perpendicular to the direction between 
the double wells (radial) and two along this direction (axial). Although 
all these modes are useful for dipole-dipole coupling, we concentrate 
on the two axial modes, with uncoupled well frequencies $\omega_1 \approx \omega_2$, and 
with eigenfrequencies and eigenvectors 


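$$\omega_{\mathrm{str}} = \bar{\omega} + \sqrt{\delta^2 + \Omega_{\mathrm{ex}}^2}$$
$$\omega_{\mathrm{com}} = \bar{\omega} - \sqrt{\delta^2 + \Omega_{\mathrm{ex}}^2} \qquad (1)$$
$$\mathbf{q}_{\mathrm{str}} = (\sin\theta_{\mathrm{str}}, \cos\theta_{\mathrm{str}})$$
$$\mathbf{q}_{\mathrm{com}} = (\sin\theta_{\mathrm{com}}, \cos\theta_{\mathrm{com}})$$


where $\theta_{\mathrm{str}} = \arctan[(\delta - \sqrt{\delta^2 + \Omega_{\mathrm{ex}}^2})/\Omega_{\mathrm{ex}}]$ and 
$\theta_{\mathrm{com}} = \arctan[(\delta + \sqrt{\delta^2 + \Omega_{\mathrm{ex}}^2})/\Omega_{\mathrm{ex}}]$. 
The average well frequency is denoted by $\bar{\omega} = (\omega_1 + \omega_2)/2$, 
and the frequency difference is $2\delta = (\omega_2 - \omega_1)$. For 
$|\delta| \gg \Omega_{\mathrm{ex}}$ these modes decouple and the two ions move nearly independently 
of each other. When approaching resonance ($\delta = 0$), the 
motions of the ions are strongly coupled, resulting in an avoided crossing 
of the motional frequencies with a splitting of $2\Omega_{\mathrm{ex}}$. On resonance, 
the normal modes are a centre-of-mass mode ($\omega_{\mathrm{com}}$, $\mathbf{q}_{\mathrm{com}} = (1/\sqrt{2}, 1/\sqrt{2})$) 
and a stretch mode ($\omega_{\mathrm{str}}$, $\mathbf{q}_{\mathrm{str}} = (-1/\sqrt{2}, 1/\sqrt{2})$), with motional 
quanta shared between the two ions. 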
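
As a sanity check on equations (1), the following Python sketch evaluates the closed-form eigenfrequencies and eigenvectors and tests them against a simple two-oscillator coupling matrix. The matrix form [[ω1, −Ω_ex], [−Ω_ex, ω2]], the sign convention δ = (ω2 − ω1)/2 and all function names are assumptions made for illustration; they are not taken from the paper's Methods.

```python
import math


def axial_normal_modes(omega_1, omega_2, omega_ex):
    """Return ((omega_str, q_str), (omega_com, q_com)) following equations (1)."""
    omega_bar = 0.5 * (omega_1 + omega_2)   # average well frequency
    delta = 0.5 * (omega_2 - omega_1)       # half the well-frequency difference
    root = math.hypot(delta, omega_ex)      # sqrt(delta^2 + omega_ex^2)
    theta_str = math.atan((delta - root) / omega_ex)
    theta_com = math.atan((delta + root) / omega_ex)
    q_str = (math.sin(theta_str), math.cos(theta_str))
    q_com = (math.sin(theta_com), math.cos(theta_com))
    return (omega_bar + root, q_str), (omega_bar - root, q_com)


def coupling_matrix_times(q, omega_1, omega_2, omega_ex):
    """Apply the assumed coupling matrix [[w1, -W], [-W, w2]] to the vector q."""
    return (omega_1 * q[0] - omega_ex * q[1],
            -omega_ex * q[0] + omega_2 * q[1])


# Illustrative values: wells near 4 MHz on resonance, 2*Omega_ex = 2*pi * 12 kHz.
well = 2.0 * math.pi * 4.0e6
exchange = 2.0 * math.pi * 6.0e3
(w_str, q_str), (w_com, q_com) = axial_normal_modes(well, well, exchange)
print("stretch mode:", q_str)
print("centre-of-mass mode:", q_com)
print("splitting / (2 pi) in kHz:", (w_str - w_com) / (2.0 * math.pi * 1.0e3))
```

On resonance the sketch returns the out-of-phase and in-phase vectors stated in the text, and the splitting equals 2Ω_ex. Away from resonance, multiplying either eigenvector by the assumed matrix returns the same vector scaled by its eigenfrequency.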

These shared quantized degrees of freedom can simulate spin-spin 
interactions, just as for two-qubit quantum logic gates with ions in 
the same harmonic well; but, unlike in the latter case, the strength of 
the spin-spin interaction can be tuned from strong to weak by controlling 
the individual trapping wells. We denote the energy eigenstates 
of the pseudo-spin-1/2 systems as $\{|\uparrow\rangle, |\downarrow\rangle\}$, corresponding to internal 
states of the ions, separated by $\hbar\omega_0$ ($\hbar$, Planck’s constant divided by $2\pi$), 
and the number states of the normal modes as $|n_{\mathrm{str}}\rangle$ and $|n_{\mathrm{com}}\rangle$. We 
excite ‘carrier’ transitions $|\downarrow, n_{\mathrm{str}}, n_{\mathrm{com}}\rangle \leftrightarrow |\uparrow, n_{\mathrm{str}}, n_{\mathrm{com}}\rangle$ with a uniform 
oscillating field at the $|\downarrow\rangle \leftrightarrow |\uparrow\rangle$ transition frequency $\omega_0$, and with phase 
$\phi_{\mathrm{c}}$. Simultaneously, a single ‘red-sideband’ excitation at frequency $\omega_0 - \bar{\omega}$ 
and phase $\phi_{\mathrm{s}}$, between the sideband frequencies for the stretch and 
centre-of-mass modes, excites both the $|\downarrow, n_{\mathrm{str}}, n_{\mathrm{com}}\rangle \leftrightarrow |\uparrow, n_{\mathrm{str}} - 1, n_{\mathrm{com}}\rangle$ 
transition and the $|\downarrow, n_{\mathrm{str}}, n_{\mathrm{com}}\rangle \leftrightarrow |\uparrow, n_{\mathrm{str}}, n_{\mathrm{com}} - 1\rangle$ transition. These 
excitations emulate an effective spin-spin interaction (Methods) 


$$H_{\mathrm{eff}} = \hbar\kappa\,\hat{\sigma}_{\phi}^{(1)}\hat{\sigma}_{\phi}^{(2)}$$
$$\hat{\sigma}_{\phi}^{(k)} = \cos(\phi_{\mathrm{c}})\,\hat{\sigma}_{x}^{(k)} - \sin(\phi_{\mathrm{c}})\,\hat{\sigma}_{y}^{(k)}$$


where $\kappa$ is the coupling strength and $\hat{\sigma}_{x,y}^{(k)}$ are the Pauli spin-1/2 operators 
of the respective ions. We can emulate antiferromagnetic ($\kappa > 0$) and 
ferromagnetic ($\kappa < 0$) interactions by our choice of the ion spacing or the 
detunings $\delta_{\mathrm{str}}$ and $\delta_{\mathrm{com}}$ of the normal modes relative to the sideband drive 
(Methods). Under the simultaneous carrier and red-sideband drive, the 
spins become periodically entangled and disentangled with the motion. 
Starting with a product state $|\Psi_i\rangle$, spins and motion are disentangled into 
a product state at $\tau_j = 2\pi j/\Omega_{\mathrm{ex}}$ ($j > 0$ integer), but the spins acquire phases 
that depend on the ions’ motion in phase space during the off-resonance 
excitation. These phases simulate the spin-spin interaction. We benchmark 
our implementation of the spin-spin interaction by starting from 
the well-defined product state $|\Psi_i\rangle = |\downarrow\downarrow\rangle$, effectively evolving it under 
an antiferromagnetic ($\kappa > 0$) interaction for time $\tau = \pi/(4\kappa)$ with $\phi_{\mathrm{c}} = 0$, 
and comparing the resulting state with the maximally entangled state 

$$|\Psi_e\rangle = \exp\left[-i\frac{\pi}{4}\hat{\sigma}_{x}^{(1)}\hat{\sigma}_{x}^{(2)}\right]|\Psi_i\rangle = \frac{1}{\sqrt{2}}\left(|\downarrow\downarrow\rangle - i|\uparrow\uparrow\rangle\right)$$

that would be produced under ideal conditions (Methods). 
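
The ideal benchmark state can be reproduced numerically. The sketch below, in plain Python, applies exp(−i(π/4) σ_x ⊗ σ_x) to the state with both spins down and compares the result with (|↓↓⟩ − i|↑↑⟩)/√2. The basis ordering, the helper names and the pure-state fidelity definition are assumptions for illustration only; the experiment itself reports a measured fidelity of 0.82(1), which this idealized calculation does not model.

```python
import math

# Basis order (an assumption of this sketch): |dd>, |du>, |ud>, |uu>,
# where d = spin down and u = spin up for ions 1 and 2.
IDENTITY_4 = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
# sigma_x (x) sigma_x flips both spins, so it maps basis index n to 3 - n.
SIGMA_XX = [[1 if r + c == 3 else 0 for c in range(4)] for r in range(4)]


def xx_evolution(angle):
    """exp(-i*angle*XX) = cos(angle)*I - i*sin(angle)*XX, because XX squared is I."""
    return [[math.cos(angle) * IDENTITY_4[r][c]
             - 1j * math.sin(angle) * SIGMA_XX[r][c]
             for c in range(4)] for r in range(4)]


def apply(matrix, state):
    """Matrix-vector product for a 4x4 matrix and a 4-component state."""
    return [sum(matrix[r][c] * state[c] for c in range(4)) for r in range(4)]


def fidelity(target, state):
    """|<target|state>|^2 for two normalised pure states."""
    overlap = sum(t.conjugate() * s for t, s in zip(target, state))
    return abs(overlap) ** 2


initial = [1, 0, 0, 0]                                 # both spins down
ideal = [1 / math.sqrt(2), 0, 0, -1j / math.sqrt(2)]   # (|dd> - i|uu>)/sqrt(2)
evolved = apply(xx_evolution(math.pi / 4), initial)

print("evolved state:", evolved)
print("fidelity with ideal state:", fidelity(ideal, evolved))
```

The closed form used in `xx_evolution` avoids a general matrix exponential; it relies only on the fact that the two-spin operator squares to the identity.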


The (pseudo-)spin-1/2 system is formed by the $|2s\,^2S_{1/2}, F = 1, m_F = -1\rangle \equiv |\uparrow\rangle$ 
and $|2s\,^2S_{1/2}, F = 2, m_F = -2\rangle \equiv |\downarrow\rangle$ hyperfine ground 
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Figure 1 | Microfabricated surface-electrode trap. Microscope image of 
ion-trap electrodes, showing radio-frequency (RF) and static-potential control 
electrodes (C1-C12). Dark areas are the 5 µm gaps between electrodes. Ions 
are trapped 40 µm above the chip surface; red dots indicate the ion locations, 
with a 30 µm spacing. Electrode C1 also supports microwave currents at 

1.28 GHz to drive carrier transitions on the two ions. 


states of ⁹Be⁺, where $F$ is the total angular momentum and $m_F$ is the 
component of $F$ along a quantization axis provided by a 1.46(2) mT 
static magnetic field (Fig. 1). The ions are confined in a cryogenic (trap 
temperature <5 K), microfabricated, surface-electrode linear Paul ion 
trap composed of 10 µm-thick gold electrodes separated by 5 µm gaps, 
deposited onto a crystalline quartz substrate. An oscillating potential 
(~100 V peak at 163 MHz), applied to the radiofrequency electrodes in 
Fig. 1, provides pseudopotential confinement of the ions in the radial 
(perpendicular to z) directions at motional frequencies of ~17 and 
~27 MHz at a distance of approximately 40 µm from the trap surface. 
Along the trap z axis, a double well is formed by static potentials applied 
to control electrodes C1-C12. The axial (z) oscillation frequencies $\omega_1$ 
and $\omega_2$ around the respective minima are typically near 4 MHz. Single-ion 
heating is in the range of 100 to 200 quanta per second. This heating 
is approximately four orders of magnitude larger than that due to 
our estimate of Johnson noise heating for this apparatus. For two ions 
spaced 30 µm apart, and in motional resonance ($\delta = 0$), the period required 
for the ions to exchange their motional energies is $\tau_{\mathrm{ex}} = \pi/(2\Omega_{\mathrm{ex}}) = 70$ µs, 
compared with an average period of 5-10 ms required to absorb a single 
motional quantum due to background heating. Fine adjustment of control-electrode 
potentials (at the 100 µV level) enables individual control of 
potential-well curvatures to tune the Coulomb interaction between the 
ions through resonance. Electrode C1 also supports microwave currents 
(typically of milliampere amplitude) that produce an oscillating magnetic 
field to drive carrier transitions at the same rate in both ions. 
Superimposed σ⁻-polarized laser beams, nearly resonant with the 
$2s\,^2S_{1/2} \rightarrow 2p\,^2P_{1/2}$ and the $2s\,^2S_{1/2} \rightarrow 2p\,^2P_{3/2}$ transitions ($\lambda \approx 313$ nm) 
and propagating along the magnetic field direction, are used for optical 
pumping, Doppler laser cooling and state detection by resonance fluorescence. 
Optical pumping prepares both ions in $|\downarrow\rangle$. We can distinguish 
the $|\downarrow\rangle$ (bright) and $|\uparrow\rangle$ (dark) states by detecting resonance fluorescence 
on the $|\downarrow\rangle \leftrightarrow |2p\,^2P_{3/2}, F = 3, m_F = -3\rangle$ optical cycling transition. Typically, 
three to five photons are detected per ion in $|\downarrow\rangle$ over a background 
of 0.15 to 0.6 photons on a photomultiplier during detection periods in 
the range 300-400 µs. A pair of elliptically shaped laser beams, separated 
in frequency by approximately the $|\downarrow\rangle \leftrightarrow |\uparrow\rangle$ transition frequency 
($\omega_0 \approx 2\pi \times 1.28$ GHz) and detuned 80 GHz above the $^2S_{1/2} \rightarrow {}^2P_{1/2}$ 
transition, illuminate both ions with equal intensity. These beams induce 
two-photon stimulated-Raman transitions for ground-state cooling 
and for the motional sideband excitations used to implement the spin-spin 
interaction. Derived from the same 313 nm source, the frequency 
difference between the beams is produced with acousto-optic modulators, 
and the beam orientation is such that the difference wavevector 
$\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$ is parallel to the z axis (with magnitude $k = 2\sqrt{2}\pi/\lambda$). The 
spin-spin coupling strength is $\kappa = \cos(2\phi)(\eta\Omega_{\mathrm{s}})^2/(2\Omega_{\mathrm{ex}})$, where $2\phi = k d_0$ 
is the phase difference of the beat note between the two laser fields at 
the positions of the ions, $\Omega_{\mathrm{s}}$ is the stimulated-Raman Rabi frequency 
and $\eta = k\sqrt{\hbar/(2m\bar{\omega})}$ (Methods). 
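
The quoted exchange period and the size of η can be estimated from the trap parameters given above. The Python sketch below assumes the standard expression for the Coulomb coupling of two equal-mass, singly charged ions in separate wells, Ω_ex = e²/(4πε₀ m ω̄ d₀³); this formula is not stated in the excerpt (the text defers to Methods) and is used here as a labelled assumption, as are the function names. Constants are CODATA values, and the ion mass neglects the missing electron.

```python
import math

# CODATA constants in SI units.
ELEMENTARY_CHARGE = 1.602176634e-19      # C
VACUUM_PERMITTIVITY = 8.8541878128e-12   # F/m
REDUCED_PLANCK = 1.054571817e-34         # J s
ATOMIC_MASS_UNIT = 1.66053906660e-27     # kg
BERYLLIUM_9_MASS = 9.0121831 * ATOMIC_MASS_UNIT  # electron mass deficit neglected


def exchange_rate(distance_m, well_frequency_rad_s, mass_kg=BERYLLIUM_9_MASS):
    """Assumed coupling Omega_ex = e^2 / (4 pi eps0 m omega d^3), in rad/s."""
    return ELEMENTARY_CHARGE ** 2 / (
        4.0 * math.pi * VACUUM_PERMITTIVITY * mass_kg
        * well_frequency_rad_s * distance_m ** 3)


def exchange_time(omega_ex):
    """tau_ex = pi / (2 Omega_ex), as defined in the text."""
    return math.pi / (2.0 * omega_ex)


def lamb_dicke_parameter(wavelength_m, well_frequency_rad_s,
                         mass_kg=BERYLLIUM_9_MASS):
    """eta = k * sqrt(hbar / (2 m omega)) with k = 2*sqrt(2)*pi / lambda."""
    k = 2.0 * math.sqrt(2.0) * math.pi / wavelength_m
    return k * math.sqrt(REDUCED_PLANCK / (2.0 * mass_kg * well_frequency_rad_s))


omega_bar = 2.0 * math.pi * 4.0e6   # wells near 4 MHz
omega_ex = exchange_rate(30.0e-6, omega_bar)
tau_ex = exchange_time(omega_ex)
eta = lamb_dicke_parameter(313.0e-9, omega_bar)

print(f"Omega_ex / (2 pi) = {omega_ex / (2.0 * math.pi) / 1e3:.2f} kHz")
print(f"tau_ex = {tau_ex * 1e6:.0f} us")
print(f"eta = {eta:.2f}")
```

Under these assumptions the 30 µm spacing and 4 MHz wells give an exchange period close to the 70 µs quoted in the text, and halving the spacing raises Ω_ex eightfold, as expected from the d₀⁻³ scaling.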
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Figure 2 | Motional spectroscopy of two coupled ions. a, The red dots 
connected by black lines indicate separate scans of the red-sideband detuning 
$\delta_{\mathrm{RSB}}$ from the average mode frequency $\bar{\omega}$ for different values of the difference 
$\delta$ between the individual well frequencies. The vertical scale is proportional 
to the sum of the probabilities for each ion to be in $|\downarrow\rangle$. At the centre of the 
avoided crossing, the normal mode frequency splitting $\Omega_{\mathrm{ex}}/\pi$ is 12(1) kHz. Each 
data point represents an average of 200 experiments. Shaded planes are a 
theoretical prediction for the avoided crossing according to equations (1). 

b, Resonant ($\delta \approx 0$) single-quantum motional exchange between two ions, with 
an exchange time $\tau_{\mathrm{ex}} = 80(2)$ µs. The vertical scale is proportional to the 
probability of the laser-addressed ion being in $|\downarrow\rangle$. Each data point represents 
an average of 500 experiments, and error bars correspond to s.e.m. Dashed lines 
are included to guide the eye. 


A key to implementing spin-spin interactions with ions in separate 
trapping zones is being able to tune the well frequencies precisely enough 
to control the eigenfrequencies and eigenmodes (equations (1)) near the 
avoided crossing. In Fig. 2a, we characterize this avoided crossing. For 
these experiments, the ions are separated by 27(2) µm. They are laser-cooled 
nearly to their motional ground states (mean motional mode 
occupation, $\bar{n}_{\mathrm{str/com}} \approx 0.1$), optically pumped to the $|\downarrow\downarrow\rangle$ state and then 
rotated into the $|\uparrow\uparrow\rangle$ state with a microwave carrier π-pulse. Fine adjustments 
are made to control electrodes C2 and C12 to tune the harmonic 
confinement of the two trapping zones, stepping the system through 
the avoided crossing. At each step, after cooling and optical pumping, 
we implement the Raman red-sideband drive and scan its detuning 
$\delta_{\mathrm{RSB}}$ with respect to $\bar{\omega}$. If the sideband excitation frequency is equal to 
$\omega_0 - \omega_{\mathrm{str}}$ or $\omega_0 - \omega_{\mathrm{com}}$ then the spin of one or both ions can flip to $|\downarrow\rangle$ 
while absorbing quanta of motion, and a peak in the resonance fluorescence 
counts is observed. The spectral resolution is set by the duration of 
the square-pulse sideband excitation (120 µs). At the centre of the avoided 
crossing, the splitting of the mode frequencies is $2\Omega_{\mathrm{ex}} = 2\pi \times 12(1)$ kHz. 

In Fig. 2b, we show data that demonstrate single-phonon exchange
between the two ions. With the trapping zones tuned to resonance
($\delta = 0$), both modes are cooled to near the motional ground state and
the ions are prepared in $|{\uparrow\uparrow}\rangle$. In this experiment, the
two Raman beams are tightly focused onto only one of the ions and are used to
add a single phonon to that ion (and flip its spin) with a π-pulse on the red
sideband of its local frequency in a duration short compared with
$\tau_{\rm ex}$. In this limit, after the pulse, the resulting motional state
is an equal superposition of both modes, and the phonon energy is therefore
exchanged between the ions with a period $2\tau_{\rm ex}$ (ref. 16). To
monitor the exchange, the same Raman interaction is applied again after a
variable delay $\tau$. This can flip the spin and remove the quantum of motion
only if the motion resides solely in the addressed ion after a particular
delay. The level of fluorescence is proportional to the probability of this
spin flip. From this, we determine an exchange time of
$\tau_{\rm ex} = 80(2)$ μs, consistent with an ion spacing of 30(2) μm for
this experiment. The reduction in contrast for longer delays is caused mainly
by fluctuations and drifts of the trapping potential. We estimate that
$\delta/2\pi$ drifted by approximately 500 Hz (a significant fraction of
$\Omega_{\rm ex}/2\pi$) during the 2-3 minutes required for the 20,000
experiments that provided the data for Fig. 2b.

For benchmarking the spin-spin interaction, the laser beams for
fluorescence detection, Doppler cooling and stimulated Raman transitions
are made to spatially overlap both ions with equal intensity. The ion
spacing (approximately 27 μm here) is adjusted to an integer number
of half-wavelengths of the difference wavevector of the two Raman
laser fields, by a technique described elsewhere, such that
$\cos(2\phi) \approx 1$. The wells are tuned to resonance ($\delta = 0$) with
adjustments to control electrodes C2 and C12. The ions are first
Doppler-cooled, then Raman sideband-cooled to near the ground state on both
normal modes, and finally optically pumped into the
$|{\downarrow\downarrow}\rangle$ state. The spin-spin interaction
is implemented by simultaneously applying a relatively strong resonant
microwave carrier excitation (Rabi frequency,
$\Omega_c = 2\pi \times 23.1(2)$ kHz) and an optical sideband excitation at
$\omega_0 - \bar\omega$ (Rabi frequency,
$\eta\Omega_s = 2\pi \times 2.4(2)$ kHz). The exchange frequency satisfies
$2\Omega_{\rm ex} = 2\pi \times 13(1)$ kHz, so that
$\kappa = 2\pi \times 446(13)$ Hz. In the middle of the coupling period, we
shift the phases $\phi_c$ and $\phi_s$ of both driving fields by 180° relative
to their phases during the first half of the coupling period. These phase
reversals suppress the dependence of the final state on the carrier Rabi
frequency and reduce sensitivity of the spin-spin interaction to drifts in
the detuning and the coupling time (Methods). At the end of the coupling
period, fluorescence detection and subsequent fitting of the photon-count
histograms to those for the three possible outcomes (two ions bright,
$|{\downarrow\downarrow}\rangle$; one ion bright,
$|{\downarrow\uparrow}\rangle$ or $|{\uparrow\downarrow}\rangle$; or both ions
dark, $|{\uparrow\uparrow}\rangle$) yield the respective probabilities $P_2$,
$P_1$ and $P_0$.
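
As a quick numerical check of the quoted coupling strength, the following minimal sketch evaluates $\kappa = \cos(2\phi)(\eta\Omega_s)^2/2\Omega_{\rm ex}$ from the central values given above. The function name and its argument convention (ordinary frequencies in Hz, that is, angular frequencies divided by 2π) are illustrative assumptions, not part of the original analysis.

```python
import math


def spin_spin_coupling_hz(eta_omega_s_hz, two_omega_ex_hz, phi=0.0):
    """Return kappa/(2*pi) in Hz.

    Uses kappa = cos(2*phi) * (eta*Omega_s)^2 / (2*Omega_ex), with both
    inputs given as ordinary frequencies (angular frequency / 2*pi).
    The function name and signature are hypothetical.
    """
    return math.cos(2.0 * phi) * eta_omega_s_hz ** 2 / two_omega_ex_hz


# Central values quoted in the text: eta*Omega_s = 2*pi x 2.4 kHz and
# 2*Omega_ex = 2*pi x 13 kHz, with cos(2*phi) close to 1.
kappa_hz = spin_spin_coupling_hz(2.4e3, 13e3)
print(f"kappa/2pi = {kappa_hz:.0f} Hz")
```

With these central values the result lies within the quoted uncertainty of 446(13) Hz; the small difference presumably reflects rounding of the inputs.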

Evolution of these probabilities as functions of the coupling duration
is shown in Fig. 3a. Near 300 μs, $P_2$ and $P_0$ are approximately equal
($P_2 + P_0 = 0.91(2)$) and $P_1$ has reached a minimum. To show that the
resulting state is entangled, in a subsequent experiment we stop the
evolution at 300 μs, apply a carrier π/2-pulse of variable phase $\phi_a$, and
determine the parity $\Pi = P_2 + P_0 - P_1$ as a function of $\phi_a$. These
data are shown in Fig. 3b together with a fit to
$A\cos(2\phi_a + \phi_0) + B$. The fitted probabilities and the contrast
$A = 0.73(2)$ imply a state fidelity
$F = \langle\Psi_e|\rho_e|\Psi_e\rangle = (P_2 + P_0 + A)/2 = 0.82(1)$, where
the density matrix $\rho_e$ describes the experimentally produced state
(Methods). From simulations and independent measurements, we estimate the
leading contributions to the observed infidelity as follows: drift and
fluctuations of the trapping potentials (including 'anomalous' motional
heating) contribute ~0.08; spontaneous emission due to off-resonance
excitation by Raman laser beams contributes ~0.02; Raman laser beam intensity
fluctuations contribute ~0.03; and state preparation and detection errors
contribute ~0.03.
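
The arithmetic behind the quoted fidelity can be reproduced with a short sketch. It only uses the relation $F = (P_2 + P_0 + A)/2$ and the central values stated above; the function and variable names are illustrative assumptions, and the dictionary simply tabulates the four estimated error contributions.

```python
def bell_state_fidelity(p2_plus_p0, parity_contrast):
    """F = (P2 + P0 + A) / 2 for the target state (hypothetical helper)."""
    return 0.5 * (p2_plus_p0 + parity_contrast)


# Central values from the text: P2 + P0 = 0.91(2), A = 0.73(2).
fidelity = bell_state_fidelity(0.91, 0.73)

# Leading infidelity contributions as estimated in the text.
error_budget = {
    "trap potential drift and fluctuations": 0.08,
    "spontaneous emission": 0.02,
    "Raman beam intensity fluctuations": 0.03,
    "state preparation and detection": 0.03,
}
budget_total = sum(error_budget.values())

print(f"F = {fidelity:.2f}, observed infidelity = {1 - fidelity:.2f}")
print(f"sum of listed contributions = {budget_total:.2f}")
```

The listed contributions sum to about 0.16, compared with an observed infidelity of about 0.18, so they account for most but not necessarily all of the error.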

Figure 3 | Characterizing the spin-spin coupling interaction between ions
in separate trapping zones. a, Evolution of probabilities $P_0$ of
$|{\uparrow\uparrow}\rangle$ (red), $P_1$ of $|{\downarrow\uparrow}\rangle$
and $|{\uparrow\downarrow}\rangle$ (green), and $P_2$ of
$|{\downarrow\downarrow}\rangle$ (blue), as functions of coupling duration.
Each data point represents an average of 400 experiments, and error bars
correspond to the s.e.m. b, Parity oscillation obtained (for a coupling
duration of 300 μs) by applying a carrier analysis π/2-pulse with variable
phase $\phi_a$, and a fit to the data (black curve). Each data point
represents an average of 400 experiments, and error bars correspond to the
s.e.m.

For scalable implementations of lattices of interacting spins, the
quality and ease of tuning of the spin-spin interaction must be improved;
however, there are no apparent fundamental barriers to this. Trap potential
fluctuations in our experiments appear to be dominated by changes
in surface charging and work functions rather than changes in externally
applied control potentials. It should be possible to suppress these
fluctuations by improving the surface quality of the electrodes, reducing
the amount of nearby dielectric materials and minimizing the exposure
of the electrodes to ultraviolet light through better beam shaping. Laser
intensity and pointing noise can be reduced by passive or active
stabilization of the beams with respect to the ions (or both), or potentially
avoided entirely by using microwave gradient fields for the sideband
interactions. The microfabrication techniques used to construct the
trap are scalable to larger arrays of trapped ions, thus potentially enabling
informative 'analogue' quantum simulations without requiring arbitrarily
precise quantum control. Theoretical work to quantify the common
belief that many observables of interest in analogue quantum simulations
are sufficiently robust is ongoing (Methods). Initial indications
are that the proposed technical improvements may be sufficient. A
three-by-three lattice is sufficient to simulate quantum Hall physics, and with
six-by-six lattices fractional Hall effects and other intriguing solid-state
phenomena become accessible. Even for these modest numbers of
spins, modelling of quantum interactions with conventional computers is
challenging; this difficulty may be overcome with quantum simulations.
Online Content Methods, along with any additional Extended Data display items
and Source Data, are available in the online version of the paper; references
unique to these sections appear only in the online paper.

METHODS

Normal modes of the coupled wells. We consider two ions, cooled close to their
motional ground states. Along the direction of separation, each ion is confined to a
separate minimum of a double-well potential with minima denoted 'l' (left) and 'r'
(right). We assume much stronger confinement in the remaining directions, such
that it is sufficient to consider only motion along the direction of the separated
double well. The Hamiltonian of the motion of two ions of mass m and charge Q,
spaced at an average distance $d_0$ in wells with local harmonic oscillator ladder
operators $a_l$ and $a_r$ and uncoupled oscillation frequencies $\omega_l$ and
$\omega_r$, including Coulomb coupling and neglecting constant energy terms, can
be written for small motional excitation as

$$H_m = \hbar\omega_l a_l^\dagger a_l + \hbar\omega_r a_r^\dagger a_r - \hbar\Omega_{\rm ex}\left(a_l^\dagger a_r + a_l a_r^\dagger\right)$$

with

$$\Omega_{\rm ex} = \frac{Q^2}{4\pi\epsilon_0 m\sqrt{\omega_l\omega_r}\,d_0^3}$$

We define $\bar\omega = (\omega_l + \omega_r)/2$ and
$\delta = (\omega_r - \omega_l)/2$ and transform the motion into
a normal-mode basis with eigenfrequencies and eigenvectors (expressed in the
eigenmode basis of two uncoupled ions)

$$\omega_{\rm str/com} = \bar\omega \pm \sqrt{\delta^2 + \Omega_{\rm ex}^2}$$

$$\mathbf{e}_{\rm str/com} = \left(e^{(l)}_{\rm str/com}, e^{(r)}_{\rm str/com}\right) = \left(\sin(\theta_{\rm str/com}), \cos(\theta_{\rm str/com})\right) \qquad (1)$$

where

$$\theta_{\rm str/com} = \arctan\left(\frac{\delta \mp \sqrt{\delta^2 + \Omega_{\rm ex}^2}}{\Omega_{\rm ex}}\right)$$

and the upper and lower signs apply to the stretch and centre-of-mass modes,
respectively. In this basis, the motional Hamiltonian is

$$H_m = \hbar\omega_{\rm str} a_{\rm str}^\dagger a_{\rm str} + \hbar\omega_{\rm com} a_{\rm com}^\dagger a_{\rm com}$$

where $a_{\rm str/com}$ are the corresponding ladder operators in the coupled
basis. For $\delta = 0$, we recover the familiar centre-of-mass and stretch modes
with a mode splitting of $2\Omega_{\rm ex}$, and in the limit
$\delta \ll \Omega_{\rm ex}$ we can approximate

$$\mathbf{e}_{\rm str/com} \approx \frac{1}{\sqrt{2}}\left(\mp 1 + \frac{\delta}{2\Omega_{\rm ex}},\; 1 \pm \frac{\delta}{2\Omega_{\rm ex}}\right)$$

Interaction Hamiltonian. The two ions are driven resonantly by a spatially
uniform excitation on the carrier transition
$|{\downarrow}_{l/r}\rangle \leftrightarrow |{\uparrow}_{l/r}\rangle = \sigma^+_{l/r}|{\downarrow}_{l/r}\rangle$
at frequency $\omega_0$, Rabi frequency $\Omega_c$ and phase $\phi_c$. In the
interaction picture and rotating-wave approximation, the carrier interaction
takes the form

$$H_c = \hbar\Omega_c\left[\left(\sigma_l^+ + \sigma_r^+\right)e^{i\phi_c} + \left(\sigma_l^- + \sigma_r^-\right)e^{-i\phi_c}\right]$$

with $\sigma^-_{l/r} = (\sigma^+_{l/r})^\dagger$. Simultaneously, the ions are
driven close to the Raman red sidebands of both normal modes by two laser beams
(quantities associated with which will be denoted using indices 1 and 2) with
difference wavevector ($\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$; magnitude
$k = 2\sqrt{2}\pi/\lambda$) aligned along the direction of the double well, having
frequency difference $\Delta\omega = \omega_2 - \omega_1 \approx \omega_0 - \bar\omega$
and phase difference $2\phi = kd_0$ for the beat note between the two laser
fields at the positions of the ions. For $\Delta\omega = \omega_0$, the carrier
Rabi rate is $\Omega_s$. We assume the Lamb-Dicke limit, where
$\eta_{\rm str/com}^2\,\bar n_{\rm str/com} \ll 1$, with $\bar n_{\rm str/com}$
the average occupation numbers and
$\eta_{\rm str/com} = k\sqrt{\hbar/2m\omega_{\rm str/com}}$ the Lamb-Dicke
parameters of the respective normal


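modes. The near-resonant terms of the red-sideband Hamiltonian are

$$H_{\rm RSB} = \hbar\Omega_s\left[\left(e^{i\phi}\sigma_l^+\sin\theta_{\rm str} + e^{-i\phi}\sigma_r^+\cos\theta_{\rm str}\right)\eta_{\rm str}\,a_{\rm str}\,e^{-i(\delta_{\rm str}t - \phi_s)} + \left(e^{i\phi}\sigma_l^+\sin\theta_{\rm com} + e^{-i\phi}\sigma_r^+\cos\theta_{\rm com}\right)\eta_{\rm com}\,a_{\rm com}\,e^{-i(\delta_{\rm com}t - \phi_s)}\right] + {\rm h.c.}$$

where $\delta_{\rm str/com} = \Delta\omega - \omega_0 + \omega_{\rm str/com}$ is
the detuning relative to the red sideband of the respective normal mode, and
$\phi_s$ is the phase of the sideband excitation at the mean position of the ions.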
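Spin-spin interaction. In the limit of a strongly driven carrier, such that
$|\Omega_c| \gg \{|\eta_{\rm str/com}\sqrt{\bar n_{\rm str/com}}\,\Omega_s|, |\delta_{\rm str/com}|\}$,
it is helpful to first transform to an internal-state basis where the bare spin
states are dressed by the carrier. In this dressed frame, the basis states
$\{|+_{l/r}\rangle, |-_{l/r}\rangle\}$ are eigenstates of

$$\sigma^{\phi_c}_{l/r} = \cos(\phi_c)\,\sigma^x_{l/r} - \sin(\phi_c)\,\sigma^y_{l/r}$$

with $|\pm_{l/r}\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow}_{l/r}\rangle \pm e^{-i\phi_c}|{\downarrow}_{l/r}\rangle\right)$
and $\sigma^{\phi_c}_{l/r}|\pm_{l/r}\rangle = \pm|\pm_{l/r}\rangle$. For each of
the four internal basis states $|\pm_l\rangle|\pm_r\rangle$ and each normal mode,
the sideband interaction can be written (neglecting rapidly oscillating terms
near $2\Omega_c$)

$$H_d = i\hbar\left(d_{\rm com}e^{i\delta_{\rm com}t}a_{\rm com}^\dagger - d_{\rm com}^*e^{-i\delta_{\rm com}t}a_{\rm com}\right) + i\hbar\left(d_{\rm str}e^{i\delta_{\rm str}t}a_{\rm str}^\dagger - d_{\rm str}^*e^{-i\delta_{\rm str}t}a_{\rm str}\right)$$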
where the coefficients $d_{\rm str/com}(s_l, s_r)$ are state-dependent coherent
displacement rates (equation (2)), with $s_{l/r} \in \{-1, 1\}$ the eigenvalues
corresponding to the basis states in question. The integrated displacements
$\alpha_{\rm str/com}$ and the geometric phases $\Phi_{\rm str/com}$ acquired
after time t are

$$\alpha_{\rm str/com}(s_l, s_r, t) = i\,\frac{d_{\rm str/com}(s_l, s_r)}{\delta_{\rm str/com}}\left(1 - e^{i\delta_{\rm str/com}t}\right)$$

$$\Phi_{\rm str/com}(s_l, s_r, t) = \left|\frac{d_{\rm str/com}(s_l, s_r)}{\delta_{\rm str/com}}\right|^2\left(\delta_{\rm str/com}t - \sin(\delta_{\rm str/com}t)\right) \qquad (3)$$

To return the motions of both modes to the original state after an interaction
duration T, we require $\alpha_{\rm str/com}(s_l, s_r, T) = 0$. This happens
irrespective of the (state-dependent) magnitude of $d_{\rm str/com}$ if
$\delta_{\rm str/com}T = c_{\rm str/com}(2\pi)$ with $c_{\rm str/com}$ an integer.
In such cases, the motion is displaced around $|c_{\rm str/com}|$ full circles in
the respective phase spaces of the two modes by the interaction. Also, because
$\delta_{\rm str} - \delta_{\rm com} = 2\sqrt{\delta^2 + \Omega_{\rm ex}^2}$, the
interaction duration can assume only certain values, determined
by $\Delta c = c_{\rm str} - c_{\rm com} > 0$, for the motion to return to its
original state:

$$T = \frac{\pi\Delta c}{\sqrt{\delta^2 + \Omega_{\rm ex}^2}}$$

If the spin and motional states are in a product state initially, they will be in a
product state at T and any integer multiple of T. The spin-dependent phases
acquired during T simplify to

$$\Phi_{\rm str/com}(s_l, s_r, T) = \left(\frac{\eta_{\rm str/com}\Omega_s}{2}\right)^2\frac{T}{\delta_{\rm str/com}}\left[1 + s_l s_r\cos(2\phi)\sin(2\theta_{\rm str/com})\right]$$
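
The loop-closure condition can be checked numerically from equation (3). The sketch below is illustrative only: the helper names are assumptions, the displacement rate `d_rate` is a hypothetical value chosen for the example, and the exchange frequency is taken from the $2\Omega_{\rm ex} = 2\pi \times 13$ kHz quoted in the main text.

```python
import cmath
import math


def loop_duration(delta, omega_ex, delta_c=2):
    """T = pi * delta_c / sqrt(delta^2 + omega_ex^2) (hypothetical helper)."""
    return math.pi * delta_c / math.hypot(delta, omega_ex)


def displacement(d, detuning, t):
    """Integrated displacement alpha(t) = i (d / detuning) (1 - exp(i detuning t))."""
    return 1j * (d / detuning) * (1.0 - cmath.exp(1j * detuning * t))


def geometric_phase(d, detuning, t):
    """Geometric phase |d / detuning|^2 (detuning t - sin(detuning t))."""
    return abs(d / detuning) ** 2 * (detuning * t - math.sin(detuning * t))


omega_ex = 2.0 * math.pi * 6.5e3  # rad/s, from 2*Omega_ex = 2*pi x 13 kHz
d_rate = 2.0 * math.pi * 1.0e3    # rad/s, hypothetical displacement rate
T = loop_duration(0.0, omega_ex)  # wells on resonance, Delta c = 2

# Sideband halfway between the modes: detunings +omega_ex and -omega_ex.
for name, detuning in (("str", omega_ex), ("com", -omega_ex)):
    alpha_T = displacement(d_rate, detuning, T)
    phase_T = geometric_phase(d_rate, detuning, T)
    print(f"{name}: |alpha(T)| = {abs(alpha_T):.1e}, Phi(T) = {phase_T:+.4f} rad")
```

Both displacements vanish at T (to numerical precision) regardless of `d_rate`, while the accumulated phase reduces to $|d|^2T/\delta_{\rm str/com}$, the form used in the simplified expression above.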

The spin-dependent term is largest if $\phi = j\pi/2$ with j integer. This
corresponds to the ions being spaced by an integer number of half-wavelengths
$\pi/k$. In the experiment, the separation of the ions is controlled by slight
changes in the well curvatures to ensure half-integer wavelength spacing. Also,
$|\sin(2\theta_{\rm str/com})|$ is reduced for $|\delta| > 0$ and eventually
vanishes as the modes decouple in the limit $|\delta| \gg \Omega_{\rm ex}$;
therefore, the most efficient spin-spin interactions are implemented for
$\delta = 0$. For our experimental conditions and $\delta = 0$, the mode splitting
is much smaller than the average mode frequency $\bar\omega$, so we can
approximate $\eta_{\rm str/com} \approx \eta = k\sqrt{\hbar/2m\bar\omega}$.
If we also assume that $\delta \ll \Omega_{\rm ex}$, the phases simplify to

$$\Phi_{\rm str/com}(s_l, s_r, T) = \left(\frac{\eta\Omega_s}{2}\right)^2\frac{T}{\delta_{\rm str/com}}\left[1 \mp s_l s_r\cos(2\phi)\left(1 - \frac{\delta^2}{2\Omega_{\rm ex}^2}\right)\right]$$


In this limit, the phases $\Phi_{\rm str/com}(s_l, s_r, T)$ depend only to second
order on the relative detuning of the two wells. The shortest loop duration T is
realized for $\Delta c = 1$, but the phase accumulates most effectively when the
sideband drive is tuned to $\bar\omega$, exactly halfway between the normal modes
($c_{\rm str/com} = \pm 1$, $\Delta c = 2$). At this detuning, the logical phase
acquired on both modes adds constructively, and there is always some degree of
phase cancellation for all other possible settings of the detuning. The total
phase accumulated on both modes during T is, to lowest order in
$\delta/\Omega_{\rm ex}$,

$$\Phi_{\rm str}(s_l, s_r, T) + \Phi_{\rm com}(s_l, s_r, T) = -s_l s_r\cos(2\phi)\frac{(\eta\Omega_s)^2}{2\Omega_{\rm ex}}T$$
For any integer multiple of T, we can summarize the action of the applied fields as

$$|\pm_l, \pm_r, jT\rangle = \exp\left[-i\cos(2\phi)\,s_l s_r\frac{(\eta\Omega_s)^2}{2\Omega_{\rm ex}}\,jT\right]|\pm_l, \pm_r, 0\rangle$$

with j a positive integer. Because this holds for a complete set of spin-basis
states, it also holds for any general initial state of the system. Therefore, at
any multiple of T, the system evolution is equivalent to that under the spin-spin
Hamiltonian

$$H_{\rm eff} = \hbar\kappa\,\sigma^{\phi_c}_l\sigma^{\phi_c}_r \qquad (4)$$

$$\kappa = \cos(2\phi)\frac{(\eta\Omega_s)^2}{2\Omega_{\rm ex}} \qquad (5)$$


A change from ferromagnetic to anti-ferromagnetic interaction can be
accomplished by a $\pi/k$ change in the ion spacing, corresponding to a $\pi/2$
change in $\phi$. Alternatively, for example, $\kappa' = -\kappa/3 < 0$ is
realized with a choice of detuning such that
($c_{\rm str} = -1$, $c_{\rm com} = -3$).
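
The dependence of the effective coupling on the choice of loop numbers can be sketched by summing the spin-dependent parts of the two mode phases for $\delta = 0$, where $\sin(2\theta_{\rm str/com}) = \mp 1$. This is a minimal sketch under stated assumptions: the function name is hypothetical, and it assumes that each spin state acquires the phase factor $e^{i\Phi}$, so that the coupling is minus the spin-dependent phase per unit time.

```python
import math


def effective_coupling(eta_omega_s, omega_ex, c_str, c_com, phi=0.0):
    """Effective spin-spin coupling (rad/s) for resonant wells (delta = 0).

    c_str and c_com are the nonzero integer loop numbers of the two modes.
    Hypothetical helper; assumes the spin-dependent phase enters as exp(i*Phi).
    """
    delta_c = c_str - c_com
    if c_str == 0 or c_com == 0 or delta_c <= 0:
        raise ValueError("need nonzero loop numbers with c_str - c_com > 0")
    loop_time = math.pi * delta_c / omega_ex
    phase_rate = 0.0
    # sin(2*theta) is -1 for the stretch mode and +1 for the centre-of-mass mode.
    for c, sin_2theta in ((c_str, -1.0), (c_com, 1.0)):
        detuning = 2.0 * math.pi * c / loop_time
        phase_rate += (eta_omega_s / 2.0) ** 2 * math.cos(2.0 * phi) * sin_2theta / detuning
    return -phase_rate


eta_omega_s = 2.0 * math.pi * 2.4e3
omega_ex = 2.0 * math.pi * 6.5e3
kappa = effective_coupling(eta_omega_s, omega_ex, 1, -1)
kappa_alt = effective_coupling(eta_omega_s, omega_ex, -1, -3)
print(f"kappa/2pi = {kappa / (2 * math.pi):.0f} Hz, ratio = {kappa_alt / kappa:.3f}")
```

Under these assumptions the symmetric choice (1, -1) reproduces equation (5), and the choice (-1, -3) gives a coupling of opposite sign and one third of the magnitude, as stated above.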

In principle, we can either perform a 'stroboscopic' emulation with the total
duration a multiple of T, or use detunings $\delta_{\rm str/com}$ whose magnitudes
are much larger, so that all $|\alpha_{\rm str/com}| \ll 1$ for any given time.
For all multiples of T, the motional states of the ions factor from the spin
states, so if one only 'looks' stroboscopically at times jT, the system
effectively appears as though only the spins have evolved according to
equations (4) and (5), while the motion has returned to its original state, thus
appearing to have been unaffected. For much larger magnitude detunings
$\delta_{\rm str/com}$, spin-motion entanglement, and, thus, the deviation of the
simulated state from that under the ideal spin-spin interaction, is small for
arbitrary durations of the interaction. The added robustness comes at the expense
of a weaker spin-spin interaction, which has to be compensated for by higher
drive power or longer simulation timescales. Finally, rather than suppressing the
bosonic harmonic oscillator modes, we can include them as an integral part of the
simulator and study collective spin-boson Hamiltonians, which have been recently
shown to contain complex behaviour comparable to models with only spin-spin
interactions.
Experimental characterization. We benchmark the spin-spin Hamiltonian of
equations (4) and (5) by using it to entangle the hyperfine states (pseudo-spins) of
two ions starting from the initial state $|{\downarrow\downarrow}\rangle$. To gain
isolation from small errors, we break the total spin-spin interaction into two
loops in phase space with $\kappa t = \pi/8$ for each loop. For the first loop, we
can choose $\phi_c = 0$ and $\phi_s = 0$ so that the eigenstates in the dressed
basis are those of $\sigma^x_{l/r}$. After finishing the first loop, we change
carrier and sideband phases to $\phi_c = \phi_s = \pi$. The change in carrier
phase is such that at the end of the second loop, the rotating frame due to the
carrier is re-aligned with the frame of the bare states. This is because rotations
around the x axis of the Bloch sphere in the first loop are unwound by rotating
around the -x axis for the same duration in the second loop. In addition, the
phase change in the sideband drive ensures that $d_{\rm str/com}(s_l, s_r)$ of the
first loop is followed by $-d_{\rm str/com}(s_l, s_r)$ in the second loop. In
total there are three sign changes in the displacement rate equation (2), the
first from replacing $\sigma^x_{l/r}$ by $\sigma^{\pi}_{l/r} = -\sigma^x_{l/r}$
and therefore $s_{l/r} \to -s_{l/r}$, the second due to $\phi_c = 0 \to \pi$ and
the third due to $\phi_s = 0 \to \pi$, which multiply to change the sign of the
displacement rate. As a consequence, the total displacement
$\alpha_{\rm str/com}(s_l, s_r, T)$ in the second loop (equation (3)) is equal and
opposite to that in the first loop and the motional wavefunctions return to their
original positions in phase space even if
$\alpha_{\rm str/com}(s_l, s_r, T) \neq 0$ due to small errors in the detunings
$\delta_{\rm str/com}$ or in loop duration, provided that those errors are
constant over both loops. The phases $\Phi_{\rm str/com}(s_l, s_r)$ depend only on
$|d_{\rm str/com}(s_l, s_r)|$, and the effective spin-spin evolution is therefore
the same in both loops. With the sideband excitation tuned to $\bar\omega$, a
single loop duration corresponds to $T_l = 2\pi/\Omega_{\rm ex}$, for a total
interaction duration of $2T_l$. Starting from the initial state
$|{\downarrow\downarrow}\rangle$, we would ideally produce the maximally entangled
state

$$|\Psi_e\rangle = \exp\left[-i\frac{\pi}{4}\sigma^x_l\sigma^x_r\right]|{\downarrow\downarrow}\rangle = \frac{1}{\sqrt{2}}\left(|{\downarrow\downarrow}\rangle - i|{\uparrow\uparrow}\rangle\right)$$

if the sideband Rabi frequency satisfies $\eta\Omega_s = \Omega_{\rm ex}/2\sqrt{2}$.
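
The entangling condition can be cross-checked against the numbers in the main text. The sketch below is illustrative: the function name and return layout are assumptions, and it uses only $\eta\Omega_s = \Omega_{\rm ex}/2\sqrt{2}$, equation (5) with $\cos(2\phi) = 1$, and $T_l = 2\pi/\Omega_{\rm ex}$.

```python
import math


def entangling_parameters(omega_ex):
    """Sideband Rabi frequency, coupling and loop time for the two-loop gate.

    omega_ex is the exchange frequency in rad/s. Hypothetical helper that
    assumes cos(2*phi) = 1 and resonant wells (delta = 0).
    """
    eta_omega_s = omega_ex / (2.0 * math.sqrt(2.0))
    kappa = eta_omega_s ** 2 / (2.0 * omega_ex)
    loop_time = 2.0 * math.pi / omega_ex
    return eta_omega_s, kappa, loop_time


# Exchange frequency from the main text: 2*Omega_ex = 2*pi x 13 kHz.
eta_omega_s, kappa, loop_time = entangling_parameters(2.0 * math.pi * 6.5e3)
total_time = 2.0 * loop_time
print(f"eta*Omega_s/2pi = {eta_omega_s / (2 * math.pi):.0f} Hz")
print(f"total duration  = {total_time * 1e6:.0f} us")
print(f"kappa * 2*T_l   = {kappa * total_time / math.pi:.3f} pi")
```

The required sideband Rabi frequency (about 2.3 kHz) and total duration (about 308 μs) are close to the experimental values of 2.4(2) kHz and approximately 300 μs, and the accumulated $\kappa \cdot 2T_l$ equals $\pi/4$, that is, $\pi/8$ per loop.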
Determination of probabilities from state-dependent fluorescence. During
one detection period (300-400 μs) we typically detect between 0.15 and 0.6 counts
if both ions are projected into $|{\uparrow\uparrow}\rangle$, and 3 to 5
additional counts for each ion in state $|{\downarrow}\rangle$. For each
experimental setting, we record count histograms for 200-500 experiments.

Consider a count histogram $h = (h(i))_i$, where h(i) experiments yielded i counts
and $N = \sum_i h(i)$ is the total number of recorded experiments. We infer the
probabilities $P_b$ with b = 0, 1, 2 by applying probability estimators
$w_b = (w_b(i))_i$ to h according to $P_b = \sum_i w_b(i)h(i)/N$. The probability
estimators are determined from the recorded photon counts for on-resonance
microwave Ramsey experiments with two ions, where the phase $\phi$ of the second
π/2-pulse was varied. These experiments are performed before and after the
experiments to be analysed. An ideal such Ramsey experiment satisfies

$$P_0(\phi) = \cos^4(\phi/2)$$
$$P_1(\phi) = \sin^2(\phi)/2$$
$$P_2(\phi) = \sin^4(\phi/2)$$

The histograms $h_\phi$ recorded at phase $\phi$ are sampled from the mixture
$P_0q_0 + P_1q_1 + P_2q_2$, where the $q_b$ are the count distributions for zero,
one or two bright ions. From this model and the Ramsey data, we can determine
$w_b$ so that $\sum_i w_b(i)h_\phi(i)$ yields $P_b(\phi)$. We use a linear
least-squares fit, regularizing it to minimize the anticipated variance when
inferring $P_b$ for the completely mixed state.

Given a probability estimator w and a recorded histogram h, we estimate the
experimental variance of the inferred probability P according to
$v = \left(\sum_i w(i)^2h(i)/N - P^2\right)/(N - 1)$. This variance determines the
error bars in Fig. 3. For the fidelities and related quantities, the variation in
the probability estimators due to the finite statistics of the Ramsey experiments
contributes an error comparable to this variance. To determine the overall
statistical error in the fidelities, we used non-parametric bootstrap resampling
on all contributing histograms with 100 bootstrap resamples to determine error
bars for fidelities and contrasts.
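
The estimator and its variance can be written down directly from the two formulas above. The sketch below is illustrative only: the function name is an assumption, and the threshold-style weights and the histogram are hypothetical toy values, not the regularized least-squares estimators or data used in the experiment.

```python
def infer_probability(weights, histogram):
    """Apply a probability estimator w to a count histogram h.

    Returns (P, v) with P = sum_i w(i) h(i) / N and
    v = (sum_i w(i)^2 h(i) / N - P^2) / (N - 1), where N = sum_i h(i).
    Hypothetical helper for illustration.
    """
    if len(weights) != len(histogram):
        raise ValueError("weights and histogram must have the same length")
    n = sum(histogram)
    p = sum(w * h for w, h in zip(weights, histogram)) / n
    second_moment = sum(w * w * h for w, h in zip(weights, histogram)) / n
    variance = (second_moment - p * p) / (n - 1)
    return p, variance


# Toy example: 100 experiments binned by photon count 0..3, with a
# hypothetical estimator that assigns counts of 0 or 1 to the dark outcome.
toy_weights = [1.0, 1.0, 0.0, 0.0]
toy_histogram = [60, 20, 15, 5]
p_dark, var_dark = infer_probability(toy_weights, toy_histogram)
print(f"P = {p_dark:.3f} +/- {var_dark ** 0.5:.3f}")
```

For a 0/1-valued estimator such as this one, the variance reduces to the familiar binomial form P(1 - P)/(N - 1).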

The assumed model for the Ramsey experiments makes no assumptions about
the shapes or relationships of the count distributions $q_b$. This was important
because we found that the $q_b$ exhibit clear deviations from Poissonian
distributions. We also determined $c_b$, the mean number of counts according to
$q_b$, and found that $c_2 - c_0$ exceeded $2(c_1 - c_0)$ by about 8% for all the
Ramsey scans considered.

Several effects result in deviations from an ideal Ramsey experiment. We found
that there is a phase offset of approximately 5° in the Ramsey scans. We shifted the
phase accordingly before determining the probability estimators. This had a
statistically negligible effect on inferred probabilities and fidelities. After
adjusting for the phase shift, we found no signature of a mismatch between the
model and the data. In addition to checking that the dependence of the histograms
on the phase was as expected, we considered whether there are more than three
count distributions contributing to the Ramsey scans. We found no signature of
such an effect. Furthermore, all other histograms, including those used to
determine fidelities, could be explained as arising from a mixture of the same
three count distributions.

An important effect that need not be apparent from the data is state-preparation
error. By simulating Ramsey experiments with state-preparation error and $q_b$ as
inferred from the data, we determined that such errors lead to systematic
overestimates of fidelities that are well correlated with the state-preparation
error. The simulations involved initial states that are mixtures of the basis
states. Let $\epsilon$ ($\ll 1$) be the probability that the state in this
mixture is not $|{\downarrow\downarrow}\rangle$. For the inferred fidelities,
we estimate a systematic increase in fidelity of approximately $1.1\epsilon$. The
quoted systematic errors are based on a pessimistic upper bound of 0.01 on
$\epsilon$. In inferring $P_b$ for a single histogram (as required for the plots
in Fig. 3), these biases are small compared with the statistical error and were
therefore not included in the error bars. We assumed that pulse errors had a
statistically small effect on inferred probabilities and fidelities.
Discussion on robustness of analogue simulations. Richard Feynman stated
that, “with a suitable class of quantum machines you could imitate any quantum
system, including the physical world”. For arbitrarily precise quantum
simulations, this requires scalable quantum computers that employ error
correction, but realizing these computers has proven to be very difficult. An
alternative that may circumvent the difficulties is to faithfully map the dynamics
of the physical model of interest onto sufficiently controllable quantum systems.
This is called 'analogue quantum simulation'. Because the overall physical
properties of interest are often determined by local observables, the expectation
is that the full quantum state need not be arbitrarily precise for useful
information to be obtained. For example, although the global many-body state of
the simulator is sensitive to a local perturbation, the expectation values of
intensive properties can be more robust. It is also noteworthy that many material
properties are robust in the presence of naturally occurring imperfections. This
suggests that a useful analogue quantum simulator might be significantly easier to
construct than a quantum computer, even in the absence of sufficiently precise
quantum gates or explicit quantum error-correction strategies needed for fault
tolerance.

Although the robustness of analogue quantum simulations is frequently asserted,
it is not a simple matter to quantify the effects of experimental imperfections on
physical properties of interest. At present, there does not exist a perfect and
rigorous way to assess the quality of the results that one can expect from an
analogue quantum simulation. Nevertheless, one can seek models and conditions for
which the effects of the quantum simulator's imperfections are expected to be
minor and well understood. A number of experimental groups, across multiple
platforms, are currently pursuing this strategy. An alternative is to seek
validation of the results on small systems that can be classically verified before
obtaining results on large systems realizing the same model. In addition,
validation may come from consistent results on multiple independent simulator
platforms. This can eliminate simulator artefacts, as has been suggested in
ref. 37.

For many developers of quantum simulators, a common Hamiltonian for testing
their setups is the transverse Ising model. Recently, a theoretical investigation
into the influence of disorder on the fidelity of quantum simulations of the
Ising model was performed. With relatively large spin chains, analogue quantum
simulator results are predicted to be usefully robust to random variation in the
coupling coefficient up to a few per cent. This high tolerance to coupling
imperfections, relative to a comparable universal quantum computation, is achieved
because the simulation required that only local observables, rather than the
entire simulator state, be robust. Although this work does not account for other
technical issues that often limit the performance of experiments, it is
nonetheless a useful performance indicator. In relation to our work, it suggests
that although further progress on reducing experimental imperfections is probably
required, the future technical improvements we propose may be sufficient. It may
also be possible to ensure that the experimental imperfections correspond to
physically relevant effects in the model under consideration. For example, Lloyd
suggested that, “decoherence and thermal effects in the quantum computer can be
exploited to mimic decoherence and thermal effects in the system to be simulated”,
as was recently demonstrated. To ensure that the platform's imperfections
represent physically relevant interactions between the model and its normal
environment, one can sometimes engineer the mapping from the ideal model to the
experimental platform. Although we cannot make a general statement on the
robustness of analogue quantum simulations, the above discussion is suggestive and
many promising examples have been proposed.
carbon nanotubes

Over the past two decades, single-walled carbon nanotubes (SWCNTs)
have received much attention because their extraordinary properties
are promising for numerous applications. Many of these properties
depend sensitively on SWCNT structure, which is characterized by
the chiral index (n,m) that denotes the length and orientation of the
circumferential vector in the hexagonal carbon lattice. Electronic
properties are particularly strongly affected, with subtle structural changes
switching tubes from metallic to semiconducting with various bandgaps.
Monodisperse 'single-chirality' (that is, with a single (n,m)
index) SWCNTs are thus needed to fully exploit their technological
potential. Controlled synthesis through catalyst engineering, end-cap
engineering or cloning strategies, and also tube sorting based
on chromatography, density-gradient centrifugation, electrophoresis
and other techniques, have delivered SWCNT samples with
narrow distributions of tube diameter and a large fraction of a
predetermined tube type. But an effective pathway to truly monodisperse
SWCNTs remains elusive. The use of template molecules
to unambiguously dictate the diameter and chirality of the resulting
nanotube holds great promise in this regard, but has hitherto
had only limited practical success. Here we show that this bottom-up
strategy can produce targeted nanotubes: we convert molecular
precursors into ultrashort singly capped (6,6) 'armchair' nanotube
seeds using surface-catalysed cyclodehydrogenation on a platinum
(111) surface, and then elongate these during a subsequent growth
phase to produce single-chirality and essentially defect-free SWCNTs
with lengths up to a few hundred nanometres. We expect that our
on-surface synthesis approach will provide a route to nanotube-based
materials with highly optimized properties for applications such as
light detectors, photovoltaics, field-effect transistors and sensors.

Recent work has produced non-planar carbon-based nanostructures
such as fullerenes, carbon pyramids, and buckybowls from their
corresponding quasi-planar polycyclic aromatic hydrocarbon precursors
through surface-catalysed cyclodehydrogenation (CDH). We have
extended the methodology to the synthesis of ultrashort singly capped
SWCNTs, that is, a SWCNT end cap with a short tube segment attached.
Such molecules represent ideal seeds for subsequent epitaxial elongation
into isomerically pure SWCNTs. Formally, this approach mimics the
conventional synthesis of SWCNTs by the root-growth mechanism, in
which nanotube growth starts by nucleation of an end-cap fragment on
a metal nanoparticle. The key point is to avoid uncontrolled, spontaneous
nucleation of end caps by providing atomically precise ultrashort
nanotube seeds which unambiguously dictate the chiral index of SWCNTs
forming on epitaxial elongation.

Precursor P1 (C96H54; Fig. 1) was designed and synthesized by multi-step
organic synthesis to tackle this challenge (for details, see Methods).
Upon intramolecular CDH it affords seed S1, an ultrashort singly capped
(6,6) SWCNT bearing a carbon nanotube segment. The selective growth
of (6,6) SWCNTs is illustrated in Fig. 1 and combines two steps: (1)
formation of seed S1, and (2) subsequent epitaxial elongation. The first
step is realized by depositing precursor P1 on a Pt(111) surface followed
by annealing to 770 K under ultrahigh vacuum conditions to induce the
surface-catalysed CDH reaction (Fig. 2a, b). The second step, epitaxial
elongation, is achieved by the incorporation of carbon atoms originating
from the surface-catalysed decomposition of a carbon feedstock
gas (Fig. 3a-c).

Figure 2c shows a scanning tunnelling microscopy (STM) image 
acquired after depositing P1 on Pt(111). No step decoration or island 
formation is observed. Interactions with surrounding molecules and 
step edges are thus largely suppressed and ensure subsequent unperturbed 
CDH ofthe precursor'*”’. For the majority of the as-deposited precursors, 
STM reveals a three-fold symmetric conformation. However, the intrinsic 


Precursor 
CoygH,4 (P1) 


CDH 


(6,6) SWCNT 
seed (S1) Singly capped 


_ (6,6) SWCNT 


Figure 1 | Two-step bottom-up synthesis of SWCNTs. (1) Formation of 
singly capped ultrashort (6,6) SWCNT seed S1 via cyclodehydrogenation 
(CDH) of the suitably designed polycyclic hydrocarbon precursor CogHs4 
(P1). (2) Nanotube growth via epitaxial elongation (EE) . Parts of the precursor 
P1 involved in the formation of the SWCNT end cap and the ultrashort CNT 
segment of the seed S1 are highlighted in orange and blue, respectively. Red 
dashed lines indicate the new C-C bonds formed upon CDH. Epitaxial 
elongation occurs via the successive addition of carbon species, as indicated 
in green.

Figure 2 | Formation of (6,6) SWCNT seeds S1. a, b, Illustration of the
thermally induced surface-catalysed CDH to form the (6,6) SWCNT seed
S1 from the adsorbed precursor P1. c, d, STM images of precursor molecules as
deposited on Pt(111) (c) and after annealing to 770 K (d). e, Close-up STM
image of a precursor (top) and the corresponding simulation based on the
extended LUMO (greyscale) with a structural model of the molecule
superimposed (bottom). f, Line profiles (positions indicated in c, d) over an
as-deposited precursor P1 (grey line) and the seed species S1 obtained after
annealing to 770 K (black line). g, Close-up STM images taken at −1 V and
0.1 V of (6,6) SWCNT seed S1 (left) and the corresponding simulations of
HOMO and LUMO, respectively (greyscale, right).


axial chirality of the benzo[c]phenanthrene moieties ([4]helicene) (see 
Extended Data Fig. 1) and the configurational flexibility of the peripheral 
biphenyl moieties produce a large variety of possible geometries (Fig. 2c, 
Extended Data Fig. 1). The presence of axial chirality in the [4]helicene 
moieties results in four possible stereoisomers and thus eight possible 
adsorption geometries (Extended Data Fig. 1). An example of a mole- 
cule with its three outer biphenyl groups located closest to the surface 
is shown in Fig. 2e, together with the corresponding STM simulation 
based on the extended lowest unoccupied molecular orbital (LUMO; 
see Methods). The excellent agreement between STM image and sim- 
ulation (Extended Data Fig. 1) indicates that the different topographic 
features observed for the adsorbed precursors can be attributed to the 
different adsorption geometries. Importantly, the stereoisomerism does 
not affect the CDH process, since all chiral centres will disappear during 
intramolecular cyclization. 

Although P1 is designed to yield seed S1, the conformational flexibility
of the peripheral biphenyl groups leads partially to undesired adsorp- 
tion geometries. In contrast to the stereoisomers discussed above, these 
molecules will follow a different CDH pathway, ending in the formation 
of undesired buckybowls (Extended Data Fig. 2). A statistical analysis 
of more than 100 precursor monomers observed by STM revealed that 
more than 50% adopt the desired configurations (Extended Data Fig. 1). 
Most importantly, the condensation products of precursor molecules 
exhibiting ‘wrong’ conformations cannot act as seeds for the subsequent 
CNT growth process via epitaxial elongation, and thus will not affect 
the selectivity of SWCNT formation.

Figure 3 | Epitaxial elongation of singly capped SWCNT with (6,6) chiral
index defined by the seed S1. a-c, Schematic illustration of the epitaxial
elongation of (6,6) SWCNT seeds S1 via surface-catalysed C2 incorporation at
the nanotube/metal interface. The use of ethanol as a carbon feedstock gas is
illustrated, which decomposes on the hot, reactive Pt surface into C species and
binds ‘epitaxially’ to the bay region of the previously formed singly capped
SWCNT, resulting in an elongation of the tube along its axis. d-f, STM images
of the as-prepared (6,6) SWCNT seeds (d), and after exposure to low doses
of ethylene of 1 L (e) and 5 L (f), respectively, at a temperature of 770 K. For
direct comparison, the same height colour scale is used for d-f. g, Line profiles
taken across the features indicated by arrow pairs in the corresponding
STM images d-f. h, i, STM images of a sample exposed to a pressure of
1 × 10⁻⁷ mbar of ethanol for 1 h (270 L) at a temperature of 770 K. A long
SWCNT is observed to lie on the rough surface (h). A close-up STM image
(i) identifies it as a (6,6) SWCNT.
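
The exposures quoted in Langmuir (L) follow from the feedstock pressure and the exposure time. The sketch below is an illustration, not part of the original work; it assumes the standard definition 1 L = 10⁻⁶ Torr s and the conversion 1 Torr = 1013.25/760 mbar, and the function name is hypothetical.

```python
# Convert a gas exposure (pressure x time) into Langmuir units.
# Assumed definitions: 1 L = 1e-6 Torr*s, 1 Torr = 1013.25/760 mbar.
MBAR_PER_TORR = 1013.25 / 760.0
TORR_SECONDS_PER_LANGMUIR = 1e-6


def exposure_langmuir(pressure_mbar: float, time_s: float) -> float:
    """Return the exposure in Langmuir for a constant pressure held for time_s."""
    pressure_torr = pressure_mbar / MBAR_PER_TORR
    return pressure_torr * time_s / TORR_SECONDS_PER_LANGMUIR


# Conditions taken from the text: 1e-7 mbar of ethanol for 1 h and for 30 min.
one_hour = exposure_langmuir(1e-7, 3600.0)
half_hour = exposure_langmuir(1e-7, 1800.0)
print(f"1 h: {one_hour:.0f} L, 30 min: {half_hour:.0f} L")
```

Under these assumptions the 1 h exposure comes out at about 270 L, matching the quoted value, and the 30 min exposure at about 135 L.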


Surface-catalysed CDH of precursors (P1) into seeds (S1) is induced
by annealing at 770 K for 10 min. STM images (Fig. 2d) show that the
originally quasi-planar three-fold symmetric molecules transform into
dome-shaped species with a prominent increase in apparent height from
2 to 4.5 Å (Fig. 2f). Additional proof of successful dehydrogenation of P1
into S1 derives from the good agreement of high-resolution STM images
and simulations of the frontier molecular orbitals of S1 (Fig. 2g). Both
results demonstrate the successful formation of the targeted singly capped
ultrashort (6,6) SWCNT S1.

In the second step, the surface-anchored seeds S1 are grown into
(6,6) SWCNTs via epitaxial elongation by exposing them to a carbon
feedstock gas such as ethylene or ethanol at temperatures between 670
and 770 K (see schematic illustration in Fig. 3a-c). The pre-synthesized
seeds S1 are extended by catalytic epitaxial elongation, which consists
of a consecutive incorporation of carbon atoms (schematically shown as
C2) originating from the decomposition of ethanol or ethylene. The open
part of S1 is already in the required contact with the Pt(111) surface,
which catalyses further epitaxial elongation. To demonstrate unambiguously
the activity of S1 in an epitaxial elongation process, the
results of low exposures to carbon feedstock gas were followed in situ by
STM. Low doses of ethylene at 770 K produce a substantial increase in
apparent height from 4 to 20 Å (Fig. 3d-g). After exposure to 1 Langmuir
(L) of ethylene, about 18% of the initially deposited precursors P1 have
grown into SWCNTs, a density that remains constant for higher exposures
of 5 L. Exposure to yet higher doses of ethylene or ethanol produces
strong changes in the surface topography and makes STM imaging
increasingly difficult (Fig. 3h). The topography becomes rough and the
Pt surface is no longer discernible. Careful examination of the STM images
reveals the presence of one-dimensional structures lying across the surface
(Fig. 3h), with an abundance of about 5 per µm² and observed lengths
exceeding 200 nm. High-resolution images of the elongated structures
(Fig. 3i) reveal an internal structure of higher contrast lines along the
direction of the tube axis. A superimposed structural model clearly shows
that these lines are indeed the carbon positions of the graphene structure
in a (6,6) SWCNT. In all cases where atomic resolution of the tubes could
be achieved, the structure proved to be consistent with a (6,6) SWCNT.
The outstanding agreement of both the orientation and the periodicity
of the graphene lattice with those expected demonstrates that the
one-dimensional structures are the targeted (6,6) SWCNTs, bent horizontally
across the sample surface.

To further corroborate the density and length of the horizontally aligned 
SWCNTs, the above sample was imaged with a scanning helium ion 
microscope (SHIM). Tubes longer than 300 nm and with diameters below 
2 nm (the resolution limit) could be observed (Extended Data Fig. 3), 
with a density of 3-4 tubes per µm² that is similar to the density estimated
from STM images (5 per µm²). We note that both STM and
SHIM can only image horizontally aligned SWCNTs, but do not give 
access to vertically aligned SWCNTs. 

To shed light on the orientation of the SWCNTs and to characterize 
the selectivity of the growth process, Raman characterization was per- 
formed. Measurements with the illuminating laser beam at normal inci- 
dence (perpendicular to the surface plane) yielded extremely low intensities; 
the G band, the most intense feature at around 1,590 cm⁻¹, is very weak
(grey curve in Fig. 4a). However, when samples were measured under 
an illumination angle of 30°, all bands increased significantly in intensity 
(black curve in Fig. 4a), as expected for a dominant fraction of SWCNTs 
oriented perpendicular to the surface. The polarizability of these vertically 
oriented tubes with the illuminating laser beam at normal incidence— 
and thus the electromagnetic field vector perpendicular to the tube axis—is 
drastically reduced, resulting in weak Raman intensities. A predominantly
vertical alignment of the SWCNTs is consistent with the observation
that in SHIM images some CNTs appear to shake under the ion
beam (Extended Data Fig. 3), and with the rough surface seen in STM. 
It also implies that the overall SWCNT density is expected to be signi- 
ficantly higher than that estimated from the STM and SHIM data. 

More importantly, the Raman spectra demonstrate the high selectivity
of our growth process, which produces (6,6) SWCNTs exclusively. The
spectrum shown in Fig. 4b presents clearly defined bands at the positions
expected for (6,6) SWCNTs. The band at 295 cm⁻¹ is associated with the
radial breathing mode (RBM), whose frequency depends strongly on the
nanotube diameter. Empirical equations predict an RBM frequency of
280-295 cm⁻¹ for (6,6) SWCNTs. Experimentally, the RBM for (6,6)
SWCNTs deposited on SiO2 has been reported at around 289 cm⁻¹
(ref. 24), but it is well known that the substrate plays a crucial role in
the RBM position, explaining the small deviation observed. Most importantly,
our Raman spectra do not show any further bands within the RBM
range (200-400 cm⁻¹), which underlines the extremely high selectivity
of the process. Another notable characteristic of the RBM is the exceptionally
small width observed (the full-width at half-maximum, FWHM,
is 3.5 cm⁻¹, limited by the resolution of the instrument), which is as small
as that from isolated individual small-diameter SWCNTs (an FWHM
of 3 cm⁻¹). The Raman spectra reported here average over a large area
(beam diameter 1.5-10 µm (inset in Fig. 4b)), and thus reflect the properties
of a large number of SWCNTs. The narrow RBM peak therefore
demonstrates a very high degree of monodispersity.
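
The link between chiral index, tube diameter and RBM position can be made concrete with a short calculation. This is a sketch under stated assumptions, not the authors' analysis: the graphene lattice constant (0.246 nm) and the coefficients A = 223.5 cm⁻¹ nm and B = 12.5 cm⁻¹ in the empirical form ω = A/d + B are illustrative literature-style values, and other parameterizations shift the result by several cm⁻¹.

```python
import math

# Assumed graphene lattice constant in nm (illustrative value).
A_LATTICE_NM = 0.246


def swcnt_diameter_nm(n: int, m: int, a_nm: float = A_LATTICE_NM) -> float:
    """Geometric diameter of an (n, m) nanotube: d = a * sqrt(n^2 + n*m + m^2) / pi."""
    return a_nm * math.sqrt(n * n + n * m + m * m) / math.pi


def rbm_wavenumber(d_nm: float, a_coeff: float = 223.5, b_coeff: float = 12.5) -> float:
    """Empirical RBM position in cm^-1, omega = A/d + B (coefficients are assumptions)."""
    return a_coeff / d_nm + b_coeff


d66 = swcnt_diameter_nm(6, 6)
omega66 = rbm_wavenumber(d66)
print(f"(6,6): d = {d66:.3f} nm, RBM = {omega66:.0f} cm^-1")
```

With these assumed coefficients the (6,6) tube has a diameter of roughly 0.81 nm and an RBM inside the 280-295 cm⁻¹ window quoted above; the substrate dependence noted in the text is not modelled.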

The G band appears as a double peak at 1,518 and 1,591 cm⁻¹. The
significant curvature in small-diameter SWCNTs causes a shift to lower
frequencies of the optical vibrations associated with transverse (perpendicular
to the tube axis) atomic displacements (the G⁻ band). Although,

Figure 4 | SWCNT orientation determination and single chirality
assessment by Raman spectroscopy. a, Raman spectra of epitaxially elongated
(6,6) SWCNTs obtained by exposing seeds S1 to a pressure of 1 × 10⁻⁷ mbar
of ethanol for 30 min (140 L) at a temperature of 670 K. The spectra were
acquired with the sample surface perpendicular (grey curve) and at an angle of
30° (black curve) to the laser beam. b, Raman spectrum of longer SWCNTs
(1 h at 1 × 10⁻⁷ mbar of ethanol at 670 K; 270 L) for a short laser illumination
time (30 s), revealing defect-free SWCNTs as judged by the absence of a D peak.
The insets show further details on the very narrow RBM at 295 cm⁻¹ as
measured with a beam diameter of 1.5 (filled circles) and 10 µm (open circles),
respectively, and the splitting of the G band into a G⁺ component at 1,591 and a
G⁻ component at 1,518 cm⁻¹, which is characteristic of (6,6) SWCNTs.


to our knowledge, this splitting has not been observed experimentally
for a (6,6) SWCNT, the G⁺ and G⁻ band splitting (Δω_G) has been predicted
to be 83 cm⁻¹ (ref. 28), in good agreement with the splitting that
we observe (Δω_G = 73 cm⁻¹). The additional peaks in the range from 400
to 1,200 cm⁻¹ have previously been observed and used as evidence for
the presence of armchair SWCNTs, since no peak in this range is present
for semiconducting tubes. Finally, the absence of any D band in
the Raman spectra (Fig. 4b) further underlines the extreme cleanliness
of our process that yields not only predefined single-chirality but also
essentially defect-free SWCNTs.

These findings clearly illustrate that the use of a planar metal surface 
instead of metal nanoparticles effectively suppresses spontaneous cap formation
and that this, in combination with the use of a suitable precursor
species to produce a desired cap, enables highly selective SWCNT fab- 
rication. The entire in situ process and the low temperatures involved 
in both the CDH step (770 K) and the subsequent epitaxial elongation 
(670 K) are fully compatible with complementary metal oxide semicon- 
ductor (CMOS) technology, and our methods might therefore solve two 
pivotal challenges in the realization of CNT-based integrated circuits for 
digital electronics: to provide SWCNTs with identical electronic prop- 
erties, and to integrate these SWCNTs in device architectures (which 
may be achieved by site-specific deposition of the molecular precursor 
and/or the catalyst film). However, further progress that aims to be tech- 
nologically relevant requires a process with higher growth yields that 
ideally approach unity. While we see considerable scope for optimiza- 
tion of the present process (in terms of catalyst, temperatures, pressures, 
and so on), an alternative is to replace the substrate-catalysed epitaxial 
elongation step by a more efficient process, such as hot filament chemical
vapour deposition. This would also allow more flexibility in independently
optimizing seed formation and tube elongation while simultaneously
suppressing spontaneous SWCNT growth at both stages. Another
logical extension of the process is to other SWCNTs where subtleties 
related to the chiral index of the seed may come into play. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper.

Mercury is a toxic, bioaccumulating trace metal whose emissions to
the environment have increased significantly as a result of anthropogenic
activities such as mining and fossil fuel combustion. Several
recent models have estimated that these emissions have increased
the oceanic mercury inventory by 36-1,313 million moles since the
1500s. Such predictions have remained largely untested owing to
a lack of appropriate historical data and natural archives. Here we
report oceanographic measurements of total dissolved mercury and
related parameters from several recent expeditions to the Atlantic, Pacific,
Southern and Arctic oceans. We find that deep North Atlantic
waters and most intermediate waters are anomalously enriched in
mercury relative to the deep waters of the South Atlantic, Southern
and Pacific oceans, probably as a result of the incorporation of anthropogenic
mercury. We estimate the total amount of anthropogenic
mercury present in the global ocean to be 290 ± 80 million moles,
with almost two-thirds residing in water shallower than a thousand
metres. Our findings suggest that anthropogenic perturbations to the
global mercury cycle have led to an approximately 150 per cent increase
in the amount of mercury in thermocline waters and have tripled the
mercury content of surface waters compared to pre-anthropogenic
conditions. This information may aid our understanding of the processes
and the depths at which inorganic mercury species are converted
into toxic methyl mercury and subsequently bioaccumulated
in marine food webs.

Mercury (Hg) is emitted to the atmosphere by natural and human
sources primarily as Hg⁰, which is unusually volatile for a metal. The elemental
form is removed from the atmosphere after oxidation to Hg²⁺,
which is then deposited to land and ocean. Within the ocean, Hg²⁺ is
readily reduced to Hg⁰, resulting in surface waters being supersaturated
in the elemental form with respect to the atmosphere. With an atmospheric
lifetime of between a few months and a year as well as the evasion
of Hg⁰ from the ocean to the atmosphere, Hg from any source can
be widely dispersed across the globe. Hg in the ocean is also subject to
bioaccumulation and scavenging by organic-rich particles. Such particles
eventually sink out of the surface ocean and are respired at deeper
depths, transporting carbon, nutrients and metals like Hg in the process.
In this way, Hg is very much like carbon dioxide (CO2) in that it is
a biologically active gas that exhibits wide dispersal in the atmosphere,
vigorous air-sea exchange and vertical transport in the ocean as a result
of the particulate “biological carbon pump”. Like the other group-12
elements (Zn and Cd), we might expect Hg distributions in the ocean
to mimic macronutrients like phosphate (PO4³⁻) (low in the surface,
increasing through the thermocline, higher in the deep Pacific than in
the deep Atlantic). As can be seen in some representative vertical profiles
of Hg concentrations (Fig. 1), this general trend is indeed observed.

However, oceanic Hg distributions are a combination of pre- 
anthropogenic, nutrient-like and transient signals resulting from human 
activities over the past several centuries. Figure 2 shows the concen- 
trations of Hg and dissolved phosphate released during organic matter 
remineralization (P_remin is the apparent oxygen utilization divided by
170; ref. 12) measured in a variety of water masses from GEOTRACES 
cruises to the North and South Atlantic Ocean, the Pacific sector of the 
Southern Ocean, a GEOTRACES Intercalibration cruise to the subtrop- 
ical northeast Pacific Ocean and non-GEOTRACES cruises to the trop- 
ical Pacific Ocean (the ‘Metalloenzyme’ cruise), the North Pacific Ocean 
(CLIVAR Repeat P16), and the central Arctic Ocean (2011 Polarstern 
cruise ARK-XXVI/3-TransArc) (refs 13-16 and K.M.M., C.H.L., G.J.S. 
& M.A.S., manuscript in preparation, but unpublished data available at 
http://www.bco-dmo.org or on request). In the water masses other 
than Northern Hemisphere North Atlantic Deep Water and recently 
subducted Antarctic Bottom Water (henceforth referred to as ‘unaffected’ 
deep waters), a striking correlation between Hg and P_remin is seen (the
reduced major axis regression line in Fig. 2). This correlation offers several 
important insights: (1) these water masses possess little anthropogenic

from WOCE data (http://www.ewoce.org). Transect (black lines) is shown to
the right. Figure generated using Ocean Data View (http://odv.awi.de/). AIW,
Atlantic Intermediate water; NADW, North Atlantic Deep Water; AABW,
Antarctic Bottom Water; PDW, Pacific Deep Water.

Figure 2 | The concentration of Hg and P_remin in various water masses. The
grey symbols are the data from deep waters (>1,000 m depth) not suspected 
of containing anthropogenic Hg. The remaining symbols include North 
Atlantic Deep Water in the North Atlantic Ocean (1), Antarctic Bottom Water 
sampled between Tasmania and Antarctica (2), thermocline waters from the 
North Atlantic Ocean (6), South Atlantic Ocean (4), the Southern Ocean 
between Tasmania and Antarctica (8), the Arctic Ocean (7), the northeast 
Pacific Ocean (5) and the tropical Pacific Ocean (3). 


Hg delivered by the biological pump, otherwise a good correlation and
a y-intercept that is almost zero (−0.07 ± 0.03 pmol kg⁻¹ Hg) would
not have been observed (see Supplementary Information); (2) the slope
of the line is an expression of the Hg/P ratio in sinking organic matter
formed in surface waters from before the anthropogenic impact (1.02
± 0.03 µmol Hg per mole P); (3) the relationship between P_remin and
Hg allows us to use it as a benchmark against which water masses that
do contain anthropogenic Hg can be compared.
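
Used as a benchmark, the regression gives the Hg expected from remineralization alone, and any excess is attributed to human sources. The following sketch illustrates that step with the slope and intercept quoted above; the function names are hypothetical, and the example inputs are the Arctic thermocline water-mass means from Table 1 rather than sample-by-sample data.

```python
# Regression of Hg (pmol/kg) on P_remin (umol/kg) in 'unaffected' deep waters.
SLOPE = 1.02        # umol Hg per mol P, i.e. pmol/kg Hg per umol/kg P
INTERCEPT = -0.07   # pmol/kg Hg


def expected_natural_hg(p_remin_umol_kg: float) -> float:
    """Hg (pmol/kg) predicted by the unaffected-deep-water regression line."""
    return SLOPE * p_remin_umol_kg + INTERCEPT


def anthropogenic_hg(hg_obs_pmol_kg: float, p_remin_umol_kg: float) -> float:
    """Vertical distance of an observation above the regression line (pmol/kg)."""
    return hg_obs_pmol_kg - expected_natural_hg(p_remin_umol_kg)


# Arctic thermocline means from Table 1: Hg = 1.00 pmol/kg, P_remin = 0.26 umol/kg.
arctic_hg_anth = anthropogenic_hg(1.00, 0.26)
print(f"Arctic thermocline Hg_anth ~ {arctic_hg_anth:.2f} pmol/kg")
```

Applied to water-mass means, this gives a value close to the tabulated Hg_anth for the Arctic thermocline; the tabulated values are averages of sample-by-sample calculations, so small differences from this mean-based estimate are expected.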


Table 1 | Summary of Hg, P_remin and CO2,anth data


The impact of anthropogenic Hg emissions in the deep North Atlantic
and various thermocline water masses is evident in Fig. 2, with data
points that lie above the unaffected deep water regression line, thus
showing evidence of anthropogenic Hg contributions; the vertical distance
between the data and the line represents the amount of Hg in that
water mass contributed from human sources. It is immediately apparent,
however, that the degree of Hg perturbation for each water mass is
not the same. This can be explored further by dividing the amount of
anthropogenic Hg in each water mass by a tracer, preferably a pollutant
that has a similar emissions history. This will allow the derived amount
of anthropogenic mercury (Hg_anth) to be cross-checked against expectations
as well as greatly simplify the conversion of our measurements
to a scaled-up estimate of the total amount of pollutant Hg in the ocean.
For this purpose, we have selected the amount of anthropogenic carbon
dioxide (CO2,anth) present in each water mass (Table 1). The CO2,anth
estimates were derived using the ΔC* method of Gruber and colleagues
from a variety of data sets, and then gridded over the whole ocean (the
GLODAP database). The Hg_anth/CO2,anth ratios in most of the water
masses are not statistically different from either each other or the
Hg/CO2 ratio in primary anthropogenic atmospheric emissions (9.6-12.4
Mmol Hg per year; 0.79 ± 0.04 Pmol C per year; Hg/C = 14 ±
2 nmol mol⁻¹). However, shallower water masses appear to have
smaller mean Hg_anth/CO2,anth ratios than either North Atlantic Deep
Water (NADW) or Antarctic Bottom Water, which have mean
Hg_anth/CO2,anth ratios exceeding those in most known emissions sources. The
cause of this higher ratio is unclear, but it may be attributed to
high localized rates of atmospheric Hg deposition due to high rates of
precipitation (Southern Ocean), enrichment caused by salt rejection
during sea-ice formation, proximity to historically strong regions of
Hg emissions in North America and Europe (North Atlantic) or the
prevalence of coal burning as a source of CO2 early in the Industrial
Revolution. For example, surface waters near Iceland (the site of NADW
formation) and Antarctica (Antarctic Bottom Water) are enriched in
Hg (about 2 pM) with respect to average surface waters (0.6 pM; see


Ocean basin; water mass | Hg (pmol kg⁻¹) | P_remin (µmol kg⁻¹) | Hg_anth (pmol kg⁻¹) | CO2,anth (µmol kg⁻¹) | (Hg/CO2)anth (nmol mol⁻¹) | Selection criteria (stations; depth)

Unaffected deep waters
South Atlantic; North Atlantic Deep Water | 0.47 ± 0.08 | 0.52 ± 0.06 | 0 | 0 | NA | All stations; 1,500-4,000 m
South Atlantic; Antarctic Bottom Water | 0.67 ± 0.19 | 0.75 ± 0.19 | 0 | 0 | NA | All stations; below 4,000 m
Tropical Pacific; Pacific Deep Water and Pacific Bottom Water | 1.10 ± 0.33 | 1.13 ± 0.15 | 0 | 0 | NA | All stations; below 1,000 m
Subtropical northeast Pacific; Pacific Deep Water and Pacific Bottom Water | 1.55 ± 0.01 | 1.60 ± 0.05 | 0 | 0 | NA | One station; below 1,000 m

Affected deep waters
North Atlantic; North Atlantic Deep Water | 1.1 ± 0.2 | 0.40 ± 0.05 | 0.72 ± 0.17 | 10.2 ± 9.7 | 58 ± 29 | All stations; 1,500-4,000 m
Southern Ocean; Antarctic Bottom Water | 0.98 ± 0.17 | 0.75 ± 0.10 | 0.29 ± 0.20 | 5 ± 4 | 76 ± 8 | All stations south of 50°S, depths 100-1,000 m

Thermocline waters
South Atlantic; thermocline | 0.41 ± 0.14 | 0.49 ± 0.35 | −0.02 ± 0.28 | 25 ± 13 | NA | All stations; 100-1,000 m
Tropical Pacific; thermocline | 0.82 ± 0.35 | 0.98 ± 0.48 | −0.10 ± 0.60 | 16 ± 16 | NA | All stations; 100-1,000 m
North Atlantic; thermocline | 0.94 ± 0.27 | 0.47 ± 0.30 | 0.52 ± 0.21 | 42 ± 14 | 15 ± 8 | All stations; 100-1,000 m
Northeast Pacific; thermocline | 1.22 ± 0.39 | 0.92 ± 0.72 | 0.35 ± 0.59 | 25 ± 14 | 23 ± 18 | All stations; 100-1,000 m
Arctic; thermocline | 1.00 ± 0.11 | 0.26 ± 0.03 | 0.80 ± 0.12 | ~30 | ~27 | Two stations; 200-1,000 m
Southern Ocean; thermocline | 0.95 ± 0.057 | 0.47 ± 0.33 | 0.66 ± 0.57 | 22 ± 12 | 37 ± 24 | All stations south of 50°S, depths 100-1,000 m

Hg, P_remin and CO2,anth values are water mass averages. NA, not applicable. The Hg_anth and (Hg/CO2)anth values shown are averages of sample-by-sample calculations.

below), which is consistent with greater mean Hg/CO2 ratios in these
deep waters. Some alteration in Hg_anth/CO2,anth ratios should also be
expected from the differential behaviours in the ocean between these
two biologically active gases (Hg_anth will be moved into the thermocline
and mode waters by both the biological as well as the solubility pump,
whereas CO2,anth will not be pumped biologically because oceanic primary
productivity is not C-limited).

We used the observed Hg_anth/CO2,anth ratios in each affected water
mass to estimate the inventory of Hg_anth in the ocean as a whole by multiplying
these ratios by the estimated amount of CO2,anth in the ocean
(9.8 ± 1.6 Pmol C). Given the still small and evolving amount of oceanographic
Hg data available, we chose to use one Hg_anth/CO2,anth ratio for
intermediate waters (100-1,000 m; 25 ± 11 nmol mol⁻¹) and another
for the deep North Atlantic (66 ± 14 nmol mol⁻¹); we used the GLODAP
model estimate for the percentage of CO2,anth in each ocean layer: 15%
in surface water, 71% in intermediate waters, and 16% in deep water.
This calculation suggests that there are about 170 ± 80 Mmol of anthropogenic
Hg between 100 m and 1,000 m depth and about 100 ± 20 Mmol
deeper than 1,000 m.
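
The scaling step can be checked with a few lines of arithmetic. This is a minimal sketch of the multiplication described above (ratio × layer share × total CO2,anth), using only central values from the text; uncertainties are not propagated and the function name is hypothetical.

```python
# Total anthropogenic CO2 in the ocean (central value from the text), in mol C.
CO2_ANTH_TOTAL_MOL = 9.8e15  # 9.8 Pmol C


def layer_hg_anth_mmol(ratio_nmol_per_mol: float,
                       co2_fraction: float,
                       co2_total_mol: float = CO2_ANTH_TOTAL_MOL) -> float:
    """Hg_anth in a layer (Mmol) = Hg/CO2 ratio x layer share of CO2,anth x total CO2,anth."""
    hg_mol = ratio_nmol_per_mol * 1e-9 * co2_fraction * co2_total_mol
    return hg_mol / 1e6


# Intermediate waters (100-1,000 m): 25 nmol/mol, 71% of CO2,anth.
intermediate_mmol = layer_hg_anth_mmol(25.0, 0.71)
# Deep waters (>1,000 m): 66 nmol/mol, 16% of CO2,anth.
deep_mmol = layer_hg_anth_mmol(66.0, 0.16)
print(f"intermediate ~ {intermediate_mmol:.0f} Mmol, deep ~ {deep_mmol:.0f} Mmol")
```

The central values land near 174 Mmol and 103 Mmol, consistent with the rounded figures of about 170 and 100 Mmol.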

It is not appropriate to use P_remin to identify the anthropogenic impact
on Hg in waters shallower than 100 m because atmospheric deposition
is the primary source of Hg to the surface ocean, not particle
remineralization. Alternatively, we estimated Hg_anth in surface waters
by comparing the slope of the regression in Fig. 2 with the Hg/P ratio in
contemporary suspended particulate matter. The Hg/P ratio was derived
from analysis of Hg and P in mixed-layer particulate matter collected
by in situ pumping performed during both the North Atlantic
‘GEOTRACES’ and tropical Pacific ‘Metalloenzyme’ cruises. This ratio
is 3.4 ± 1.3 µmol Hg per mole P, indicating a factor of 3.4 ± 1.3 increase
in the concentration of Hg in surface ocean particulate matter
and presumably in solution as well since industrialization. This degree
of secular change of Hg in surface waters is consistent with archives of
atmospheric Hg deposition that indicate a 2-5-fold increase worldwide
since industrialization. The data presented here suggest that the total
amount of Hg in the top 100 m of the ocean is about 22 Mmol (an average
concentration of 0.6 pM). Accordingly, Hg_anth in this layer is about
16 ± 6 Mmol.
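
One way to read the surface-layer estimate is as a simple partition of the present-day inventory by the enrichment factor. The sketch below is an interpretation, not the authors' stated procedure: it assumes that dissolved Hg has scaled with the particulate Hg/P ratio and that the pre-anthropogenic ratio is the regression slope of 1.02 µmol Hg per mole P.

```python
def surface_hg_anth_mmol(total_mmol: float,
                         modern_hg_p_ratio: float,
                         preindustrial_hg_p_ratio: float) -> tuple[float, float]:
    """Split a present-day inventory into natural and anthropogenic parts.

    Assumes the inventory grew by the same factor as the Hg/P ratio.
    Returns (anthropogenic inventory in Mmol, enrichment factor).
    """
    enrichment = modern_hg_p_ratio / preindustrial_hg_p_ratio
    natural_mmol = total_mmol / enrichment
    return total_mmol - natural_mmol, enrichment


# Values from the text: 22 Mmol in the top 100 m; Hg/P of 3.4 (modern) vs 1.02 (regression slope).
surface_anth, surface_factor = surface_hg_anth_mmol(22.0, 3.4, 1.02)
print(f"enrichment ~ {surface_factor:.1f}x, Hg_anth ~ {surface_anth:.1f} Mmol")
```

Under these assumptions the anthropogenic share of the surface layer is about 15 Mmol, within the uncertainty of the value quoted above.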

Our overall estimate of 290 ± 80 Mmol (rounded to two significant
figures) of Hg_anth in the ocean is in reasonable agreement with a number
of model-based predictions, but suggests that the highest and
lowest model estimates are implausible. On the high end is the prediction
of Streets and colleagues, who estimated an amount of Hg_anth in
the ocean of 1,313 Mmol, which required a major contribution from
artisanal and small-scale gold mining at present and in the past. It is important
to test this particular inventory because it has featured prominently
in recent negotiations concerning international efforts to curb
emissions of Hg to the environment. Our measurements and calculations
here suggest that either the Streets estimate for past Hg anthropogenic
releases is too high, or that much of the Hg they predicted to be
in the ocean resides elsewhere, such as in soils. Recent work by Jaegle,
Zhang and colleagues has provided support for this as well by using
modelling fits to water column profiles that also suggest that loadings
to the ocean are lower than those of Streets and colleagues. It should be
noted that the estimate for total CO2,anth to which we have indexed is
for the year 1994. Estimates for more recent times and with different
methods suggest greater CO2,anth (for example, 12.9 Pmol; ref. 28), which
would predict higher values of Hg_anth as well (380 Mmol). However, this
higher estimate is still much less than that of ref. 2.
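
The sensitivity of the total to the CO2,anth estimate can be illustrated by repeating the layer calculation with both values. This sketch rests on assumptions that are not spelled out in the text: the Hg_anth/CO2,anth ratios (25 and 66 nmol mol⁻¹) and layer shares (71% and 16%) are held fixed, and the surface term is kept at 16 Mmol because it does not derive from CO2,anth.

```python
def total_hg_anth_mmol(co2_anth_pmol: float, surface_mmol: float = 16.0) -> float:
    """Whole-ocean Hg_anth (Mmol) for a given total CO2,anth (Pmol C).

    Ratios and layer fractions are central values from the text; the fixed
    surface term is an assumption of this sketch.
    """
    co2_mol = co2_anth_pmol * 1e15
    intermediate = 25e-9 * 0.71 * co2_mol / 1e6   # 100-1,000 m
    deep = 66e-9 * 0.16 * co2_mol / 1e6           # deeper than 1,000 m
    return surface_mmol + intermediate + deep


total_1994 = total_hg_anth_mmol(9.8)     # CO2,anth indexed to 1994
total_recent = total_hg_anth_mmol(12.9)  # more recent CO2,anth estimate
print(f"{total_1994:.0f} Mmol vs {total_recent:.0f} Mmol")
```

Rounded to two significant figures, the two cases give about 290 and 380 Mmol, in line with the values quoted above and both well below 1,313 Mmol.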

As noted, we found that about a third of anthropogenic Hg loadings
to the ocean are in deep water, particularly NADW. One model with
which our results agree quite well is that of Sunderland and Mason, who
used a multi-box model that explicitly included deep water formation
in the North Atlantic. In their simulation, 129 Mmol of Hg_anth are in
ocean water shallower than 1,500 m, with another 124 Mmol in deeper
waters. Thus, the prevalence of anthropogenic Hg in deep waters of the
North Atlantic indicates the importance, as captured by the Sunderland
and Mason model, of deep water formation for sequestration of surface
Hg on millennial timescales. This observation also leads to the conclusion,
given that Hg emissions from anthropogenic sources are predicted
to increase at a rate faster than in the previous few centuries, that
future loadings may somewhat overwhelm the deep water formation
sink. We should therefore expect that the rate of increase of Hg in surface
waters in the next few decades will be greater than the rate of increase
in emissions during the same time period.

The impact of anthropogenic loadings on the oceanic Hg reservoir 
can be estimated with knowledge of the total amount of Hg in the ocean. 
Taking the North and South Atlantic concentration profiles each to rep- 
resent a quarter of the whole ocean and the Pacific profiles to represent 
the other half, we estimated that the ocean contains 1,390 Mmol of dis- 
solved total Hg, with 22 Mmol in the 0-100 m surface ocean, 292 Mmol 
in the 100-1,000 m intermediate depths and 1,260 Mmol in waters dee- 
per than 1,000 m (the average concentration in these three layers being 
0.6 pM, 0.9 pM and 1.0 pM, respectively (G.J.S., C.H.L., M.J.A.R. &
C.R.H., manuscript in preparation)). These amounts are smaller than 
most previous estimates; for example, Sunderland and Mason estimated
666 Mmol in water shallower than 1,500 m and 1,095 Mmol in deeper 
water. Thus, analysis of the new data presented here suggests that the 
relative impact of human Hg emissions on the ocean is greater than 
previously thought: waters shallower than 1,000 m appear to have con- 
tained 120 Mmol in the pre-industrial past, and exhibit a factor-of-2.6 
increase, while the ocean as a whole has experienced a factor-of-1.1 
increase. 
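
The factor-of-2.6 figure for waters shallower than 1,000 m can be reproduced from the layer inventories given above. This is only an arithmetic check with a hypothetical helper name; the whole-ocean factor is not recomputed here because it depends on how the layer totals are combined.

```python
def enrichment_factor(present_mmol: float, preindustrial_mmol: float) -> float:
    """Ratio of the present-day inventory to the pre-industrial inventory."""
    return present_mmol / preindustrial_mmol


# Present-day Hg in the 0-100 m and 100-1,000 m layers (Mmol), from the text.
shallow_present_mmol = 22.0 + 292.0
# Pre-industrial Hg in waters shallower than 1,000 m (Mmol), from the text.
shallow_preindustrial_mmol = 120.0

shallow_factor = enrichment_factor(shallow_present_mmol, shallow_preindustrial_mmol)
print(f"shallow-water enrichment ~ {shallow_factor:.1f}x")
```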

As our analysis reveals, and as has been noted elsewhere, the impact
of human Hg emissions is not uniform within the ocean. Therefore, the 
extent to which methyl mercury concentrations in fish have changed 
since industrialization, and might change in response to further per- 
turbation (perhaps as much as a fivefold increase over pre-industrial 
levels by 2050) can be determined only following studies of the vertical
patterns in Hg methylation dynamics as well as basin-scale controls
on methylation of anthropogenic Hg.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper.

METHODS


Water samples were collected using ultraclean techniques, including the use of a
largely metal-free collection system and pressure filtration to 0.45 µm of water samples
directly from the sampling GO-Flo bottles. Aliquots for total ‘dissolved’ Hg
were collected in 250-ml, acid-washed, borosilicate glass bottles, oxidized with BrCl
and analysed by cold vapour atomic fluorescence spectrometry following SnCl2
reduction and gold-trap pre-concentration.

P_remin was calculated according to Anderson and Sarmiento as the apparent 
oxygen utilization divided by 170 ± 10, where the apparent oxygen utilization is 
calculated as [O2]saturated minus [O2]observed, and [O2]saturated is determined 
from depth, temperature and salinity. 
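
The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names, the assumed units (µmol kg⁻¹) and the example concentrations are hypothetical, and only the ratio of 170 ± 10 is taken from the text.

```python
# Illustrative sketch: remineralized phosphate from apparent oxygen utilization.
# Function names, units (umol/kg) and example values are assumptions.

def apparent_oxygen_utilization(o2_saturated, o2_observed):
    """AOU = saturated O2 concentration minus observed O2 concentration."""
    return o2_saturated - o2_observed


def remineralized_phosphate(o2_saturated, o2_observed, o2_to_p_ratio=170.0):
    """P_remin = AOU divided by the O2:P remineralization ratio (170 in the text)."""
    return apparent_oxygen_utilization(o2_saturated, o2_observed) / o2_to_p_ratio


def remineralized_phosphate_range(o2_saturated, o2_observed,
                                  ratio=170.0, ratio_uncertainty=10.0):
    """Return (low, high) P_remin obtained by varying the ratio by +/- its uncertainty."""
    aou = apparent_oxygen_utilization(o2_saturated, o2_observed)
    low = aou / (ratio + ratio_uncertainty)
    high = aou / (ratio - ratio_uncertainty)
    return low, high


# Hypothetical water parcel: saturation 300, observed 130 (both in umol/kg).
example_p_remin = remineralized_phosphate(300.0, 130.0)
example_range = remineralized_phosphate_range(300.0, 130.0)
```

The range function simply propagates the stated uncertainty in the ratio; it ignores any uncertainty in the oxygen measurements themselves.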

Particulate Hg and P were determined from subsamples of quartz-fibre or 
polyethersulphone filters loaded with suspended matter (<51 µm) using McLane pumps. 
For Hg, the filter aliquots were digested with 2 M HNO3, and the digest treated as 
dissolved samples. For P, polyethersulphone filter subsamples were digested in a 
3:1 sulphuric acid:hydrogen peroxide solution to oxidize and dissolve the 
polyethersulphone filter, dried down, and then particles were digested in a 4 M 
HCl/HNO3/HF mixture. The digest was analysed for multiple elements including P 
on a high-resolution inductively coupled plasma mass spectrometer and 
standardized using multi-element external standards (similar to ref. 36). 

Water masses were defined primarily based on depth (as noted in Table 1), in 
accordance with those suggested by Talley and colleagues. This definition represents 
an approximation for more refined definitions made on the basis of temperature, 
salinity and basin. 
Negative regulation of the NLRP3 inflammasome by 
A20 protects against arthritis 
Rheumatoid arthritis is a chronic autoinflammatory disease that 
affects 1-2% of the world’s population and is characterized by 
widespread joint inflammation. Interleukin-1 is an important mediator 
of cartilage destruction in rheumatic diseases, but our understanding 
of the upstream mechanisms leading to production of interleukin-1β 
in rheumatoid arthritis is limited by the absence of suitable 
mouse models of the disease in which inflammasomes contribute to 
pathology. Myeloid-cell-specific deletion of the rheumatoid arthritis 
susceptibility gene A20/Tnfaip3 in mice (A20myel-KO mice) triggers a 
spontaneous erosive polyarthritis that resembles rheumatoid arthritis 
in patients. Rheumatoid arthritis in A20myel-KO mice is not rescued 
by deletion of tumour necrosis factor receptor 1 (ref. 2). Here 
we show, however, that it crucially relies on the Nlrp3 inflammasome 
and interleukin-1 receptor signalling. Macrophages lacking A20 have 
increased basal and lipopolysaccharide-induced expression levels of 
the inflammasome adaptor Nlrp3 and proIL-1β. As a result, 
A20-deficiency in macrophages significantly enhances Nlrp3 
inflammasome-mediated caspase-1 activation, pyroptosis and interleukin-1β 
secretion by soluble and crystalline Nlrp3 stimuli. In contrast, 
activation of the Nlrc4 and AIM2 inflammasomes is not altered. 
Importantly, increased Nlrp3 inflammasome activation contributes to the 
pathology of rheumatoid arthritis in vivo, because deletion of Nlrp3, 
caspase-1 and the interleukin-1 receptor markedly protects against 
rheumatoid-arthritis-associated inflammation and cartilage destruction 
in A20myel-KO mice. These results reveal A20 as a novel negative 
regulator of Nlrp3 inflammasome activation, and describe A20myel-KO 
mice as the first experimental model to study the role of 
inflammasomes in the pathology of rheumatoid arthritis. 

A20 was deleted in myeloid cells by crossing A20FL/FL mice into 
lysozyme M (LysM)-Cre-recombinase-expressing mice. Unlike wild-type 
macrophages, A20myel-KO macrophages failed to induce A20 messenger 
RNA (mRNA) and protein expression in response to the Toll-like 
receptor-4 (TLR4) ligand lipopolysaccharide (LPS) (Fig. 1a, b), 
demonstrating the effectiveness of LysM-driven deletion of A20 in myeloid 
cells. Arthritis development in A20myel-KO mice was shown to be 
independent of tumour necrosis factor receptor 1 (TNF-R1), whereas 
deletion of MyD88 markedly protected against pathology of rheumatoid 
arthritis. As this signalling adaptor operates downstream of both TLRs 
and interleukin-1 receptor (IL-1R), we crossed Il1r1-/- mice into 
A20myel-KO mice to assess the contribution of IL-1 signalling to arthritis 
pathogenesis. As expected, wild-type mice (A20FL/FL Il1r1+/+) did not 
develop arthritis, whereas A20myel-KO mice spontaneously developed an 
arthritic phenotype (Fig. 1c). The incidence of A20myel-KO mice 
developing arthritis was 100% (Fig. 1d). In sharp contrast, A20myel-KO Il1r1-/- 
mice were virtually devoid of clinical signs of arthritis (Fig. 1c, d). In 
agreement, clinical scoring of disease severity confirmed A20myel-KO 
Il1r1+/+ mice as developing severe arthritic disease (high clinical score) 
whereas A20myel-KO Il1r1-/- mice were markedly protected (clinical 
score 0). A20myel-KO littermates that were heterozygous for IL-1R1 
expression showed an intermediate arthritic phenotype between those of 
A20myel-KO Il1r1+/+ and A20myel-KO Il1r1-/- mice (Fig. 1d, e). These 
clinical assessments were supported by a histological examination of 
representative ankle joints. Tissue sections of diseased A20myel-KO Il1r1+/- 
mice stained by haematoxylin and eosin showed significant synovial 
and periarticular inflammation and high levels of infiltrated 
mononuclear cells, which was associated with extensive cartilage and bone 
destruction (Fig. 1f). In marked contrast, ankle joints of A20myel-KO Il1r1-/- 
littermates were strongly protected from arthritic histopathology and 
contained significantly reduced numbers of infiltrating inflammatory 
cells (Fig. 1f). Semi-quantitative scoring of these histological 
parameters confirmed that the severity of arthritis was substantially lower 
in A20myel-KO Il1r1-/- mice relative to A20myel-KO Il1r1+/- littermates 
(Fig. 1g and Extended Data Table 1). These results demonstrate that 
IL-1 production is detrimental for arthritis pathogenesis in mice with a 
myeloid cell-restricted deletion in A20. 

Macrophages are a prime source of proIL-1β, and generally depend 
on caspase-1 for maturation and secretion of the biologically active 
cytokine. Caspase-1 is produced as a cytosolic zymogen, the activation 
of which is controlled by different inflammasomes. To study the role 
of A20 in inflammasome signalling, we assessed caspase-1 processing 
in bone-marrow-derived macrophages (BMDMs) of wild-type and 
A20myel-KO mice. Notably, caspase-1 activation was substantially increased 
in LPS-primed A20myel-KO macrophages that were treated with soluble 
(ATP and nigericin) or crystalline (silica) stimuli of the Nlrp3 
inflammasome compared with wild-type BMDMs (Fig. 2a). Concurrently, 
the levels of secreted IL-1β in the culture medium of ATP- and 
nigericin-treated A20myel-KO macrophages were about twice those of wild-type 
cells, and silica triggered nearly four times higher levels of secreted 
IL-1β (Fig. 2b). Enhanced caspase-1 autoprocessing (Fig. 2c, d) and IL-1β 
secretion (Fig. 2e, f) by the Nlrp3 inflammasome was evident within 
10 min after ATP or nigericin addition, and continued to increase in a 
time-dependent fashion. Similarly, the induction of caspase-1-dependent 
pyroptosis was enhanced in A20myel-KO macrophages (Fig. 2g, h). Despite 
their hypersensitivity for Nlrp3 inflammasome activation, A20myel-KO 
macrophages failed to process caspase-1 and secrete IL-1β and IL-18 
when treated with LPS, ATP or nigericin alone (Extended Data Fig. 1). 
The increased responsiveness of A20myel-KO macrophages towards 
inflammasome activation was restricted to the Nlrp3 inflammasome because 
caspase-1 processing and pyroptotic cell death by the Nlrc4 
inflammasome were similarly induced in wild-type and A20myel-KO macrophages 
that had been infected with Salmonella enterica serovar Typhimurium 
(Fig. 2i, j). Similarly, stimulation of the AIM2 inflammasome with 
cytosolic double-stranded DNA (dsDNA) did not result in differential levels 
of caspase-1 processing and pyroptosis induction in wild-type and 
A20myel-KO macrophages (Fig. 2k, l). It is worth noting, however, that 
despite normal caspase-1 activation and pyroptosis levels in response 
Figure 1 | Il1r1 deficiency rescues the arthritis phenotype of A20myel-KO 
mice. a, b, A20 and β-actin mRNA (a) and protein (b) levels of LPS-stimulated 
BMDMs. c, Hind paws of 20-week-old mice. d, e, A20FL/FL Il1r1+/+ (n = 19), 
A20myel-KO Il1r1+/+ (n = 8), A20myel-KO Il1r1+/- (n = 15) and A20myel-KO 
Il1r1-/- (n = 9) mice aged 21-30 weeks were clinically scored for arthritis 


to S. enterica serovar Typhimurium infection and dsDNA transfection, 
secretion of IL-1β in response to these treatments was consistently higher 
in A20myel-KO macrophages compared with wild-type controls (Fig. 2m, n), 
which could be explained by increased induction of proIL-1β mRNA 
in A20myel-KO macrophages (data not shown). Together, these results 
demonstrate that A20 negatively regulates activation of caspase-1 by the 
Nlrp3 inflammasome, but not by the Nlrc4 and AIM2 inflammasomes. 

Activation of the Nlrp3 inflammasome in wild-type macrophages is 
tightly regulated at different levels. A priming signal (referred to as step 
1 and usually provided by TLRs) upregulates Nlrp3 expression levels along 
with the inflammasome substrate proIL-1β via the pro-inflammatory 
transcription factor NF-κB. A20 negatively regulates LPS-induced 
NF-κB activation (refs 5-8 and Extended Data Fig. 2a), which was also 
reflected in increased secretion of the NF-κB-dependent cytokines IL-6 
and TNF in A20myel-KO macrophages (Extended Data Fig. 2b, c). We 
further showed A20-deficiency to markedly enhance LPS-induced mRNA 
expression levels of Nlrp3 (Fig. 3a) and proIL-1β (Fig. 3b). In contrast, 
LPS-induced transcript levels of caspase-1 and the inflammasome adaptor 
ASC were respectively mildly upregulated and normal in A20myel-KO 
macrophages (Extended Data Fig. 3a, b). Analysis of protein expression 
levels confirmed Nlrp3 and proIL-1β to be significantly higher in 
LPS-primed A20myel-KO macrophages than wild-type cells, whereas caspase-1 
and ASC were not differentially modulated in the two genotypes (Fig. 3c). 

TLR stimulation during brief time intervals (10 min or less) was 
recently shown to license activation of the Nlrp3 inflammasome 
independently of new protein synthesis. Rapid Nlrp3 inflammasome activation 
resulted in procaspase-1 processing and secretion of pre-synthesized 
proIL-18 in the absence of the NF-κB-dependent cytokines IL-1β, TNF 
and IL-6 (refs 9-11). To address whether A20 modulated rapid Nlrp3 
inflammasome activation, cells were exposed to ATP or nigericin after 
being primed with LPS for 10 min. As reported, wild-type BMDMs 
incidence (d) and severity (e). f, Ankle joint sections stained with haematoxylin 
and eosin; magnification ×40 (top) and ×100 (bottom). g, Histological scores 
of ankle sections of A20myel-KO Il1r1+/- (n = 10) and A20myel-KO Il1r1-/- 
(n = 8). P values in e and g were determined by Student's t-test. 


activated caspase-1 (Fig. 3d) and secreted significant amounts of IL-18, 
but not IL-1β, TNF or IL-6 (Fig. 3e). Moreover, we noted Nlrp3 protein 
levels were lowered after stimulation both in wild-type and A20myel-KO 
macrophages (Fig. 3d), supporting the notion that acute Nlrp3 
inflammasome activation occurred independently of LPS-induced new protein 
synthesis. Both caspase-1 processing and IL-18 secretion were markedly 
increased in A20myel-KO macrophages in the absence of substantial 
IL-1β, TNF and IL-6 secretion (Fig. 3d, e). This was probably due to 
increased basal expression of Nlrp3 and proIL-18 in these cells (Fig. 3c, d). 
In agreement, basal mRNA levels of Nlrp3, proIL-1β and proIL-18 were 
increased in untreated A20myel-KO macrophages (Fig. 3a, b and Extended 
Data Fig. 3c). Together, these results suggest that A20 negatively 
regulates Nlrp3 inflammasome signalling by suppressing NF-κB-dependent 
production of Nlrp3 and the inflammasome substrates proIL-1β and 
proIL-18. In agreement, the pharmacological IκB kinase (IKK) inhibitor 
BMS-345541 significantly reduced Nlrp3 levels in LPS-primed 
A20myel-KO macrophages (Fig. 3f). Moreover, both BMS-345541 
and the selective IKK2 inhibitor TPCA-1 significantly reduced ATP- 
and nigericin-induced caspase-1 autoprocessing, IL-1β secretion and 
pyroptosis induction in LPS-primed A20myel-KO macrophages (Fig. 3g-i). 

Having established A20 as a negative regulator of Nlrp3 inflammasome 
activation, we hypothesized that excessive Nlrp3 activation might 
drive pathology of rheumatoid arthritis in A20-deficient mice upstream 
of IL-1R1. To test this hypothesis, Nlrp3-/- mice were crossed into 
A20myel-KO mice and the levels of four cytokines (IL-1α, IL-1β, IL-6 and 
TNF) relevant to rheumatoid arthritis were monitored. Although 
IL-1α levels were not significantly different in A20-sufficient and A20myel-KO 
mice (Extended Data Fig. 4a), the latter group of mice had significantly 
higher levels of IL-1β in circulation (Fig. 4a). In addition, the levels of 
IL-6 and TNF were also significantly higher in A20myel-KO mice 
compared with A20FL/FL littermates (Extended Data Fig. 4b, c). Notably, 
Figure 2 | Hyperactivation of the Nlrp3, but not the Nlrc4 and AIM2 
inflammasomes, in A20-deficient macrophages. a-h, BMDMs were 
stimulated as described in Methods. Lysates were immunoblotted for caspase-1 
(a, c, d) and supernatants analysed for IL-1β (b, e, f) and LDH (g, h). Nig, 
nigericin. i-n, BMDMs were treated as described in Methods. Lysates were 
immunoblotted for caspase-1 (i, k) and supernatants analysed for LDH 
(j, l) and IL-1β (m, n). Black arrow, procaspase-1; white arrow, p20. MOI, 
multiplicity of infection. Data represent mean ± s.d. of one out of three 
biological replicates, with three technical replicates each (*P < 0.05; 
***P < 0.001; Student's t-test). 


deletion of Nlrp3 in A20myel-KO mice markedly reduced IL-1β secretion 
to baseline levels of A20FL/FL mice, thereby demonstrating that the 
Nlrp3 inflammasome contributes critically to excessive IL-1β 
production in A20myel-KO mice in vivo (Fig. 4a). Intriguingly, A20myel-KO Nlrp3-/- 
mice also were protected from excessive IL-6 production, suggesting 
that high IL-6 levels are consequent to excessive inflammasome-mediated 
IL-1β production (Extended Data Fig. 4b). In contrast, TNF production 
was not significantly affected in A20myel-KO Nlrp3-/- mice (Extended 
Data Fig. 4c). In agreement, TNF-R1 signalling was previously shown 
to be dispensable for pathology of rheumatoid arthritis in A20myel-KO 
mice, whereas IL-6 neutralization provided protection. 

Based on these findings, we assessed the contribution of Nlrp3 to 
pathogenesis of rheumatoid arthritis in A20myel-KO mice. Swelling and 
redness of the hind paws of A20myel-KO Nlrp3+/+ mice became evident 
around 11 weeks of age (Fig. 4b), and had afflicted all animals of this 
genotype when they were 20 weeks old (Fig. 4c). Disease severity 
continued to progress, and became increasingly pronounced in A20myel-KO 
Nlrp3+/+ mice aged 21-40 weeks (Fig. 4d). In contrast, A20myel-KO Nlrp3-/- 
mice of similar age were markedly protected from rheumatoid arthritis, 
Figure 3 | A20 inhibits Nlrp3 inflammasome priming. a, b, Nlrp3 (a) and 
proIL-1β (b) mRNA levels of LPS-treated BMDMs. c, Expression of the 
indicated proteins in BMDMs 6 h after LPS treatment. d, e, Rapid Nlrp3 
inflammasome activation as described in Methods. Expression of the indicated 
proteins (d) and secreted cytokines (e) were determined. f-i, A20myel-KO 
BMDMs were treated as indicated. Nlrp3 mRNA levels (f), caspase-1 
expression (g), secreted IL-1β (h) and LDH activity (i) were determined. 
Black arrow, procaspase-1; white arrow, p20. Data represent mean ± s.d. of 
one out of three biological replicates, with three technical replicates each 
(*P < 0.05; ***P < 0.001; Student’s t-test). 


and their hind paws had a normal appearance and lacked clinical signs 
of pathology of rheumatoid arthritis (Fig. 4b-d). Histological analysis 
of the ankle joints of these mice showed significantly reduced synovial 
and periarticular inflammation, and substantially fewer infiltrated 
mononuclear cells compared with tissue sections of A20myel-KO Nlrp3+/+ 
mice of comparable age (Fig. 4e, f and Extended Data Table 2). In 
agreement, three-dimensional micro-computed tomography imaging showed 
that the extent of bone erosion in hind paws of representative A20myel-KO 
Nlrp3+/+ and A20myel-KO Nlrp3-/- mice was markedly different. Unlike 
in Nlrp3-deficient A20myel-KO mice, hind paws of Nlrp3-sufficient mice 
exhibited severe loss of bone density in the metatarsal region (Fig. 4g), 
demonstrating a key role for Nlrp3 in the pathology of rheumatoid 
arthritis. 

We also analysed the impact of caspase-1/11 deficiency on IL-1β 
secretion and pathology of rheumatoid arthritis in A20myel-KO mice. 
IL-1β levels in circulation were significantly reduced in A20myel-KO 
Casp1/11-/- mice compared with A20myel-KO Casp1/11+/+ mice (Fig. 4h). As in 
A20myel-KO Nlrp3-/- mice, serum levels of IL-6 were markedly reduced 
in A20myel-KO Casp1/11-/- mice, whereas TNF production was not 
significantly different compared with A20myel-KO Casp1/11+/+ mice 
(Extended Data Fig. 4). Moreover, hind paws of all analysed A20myel-KO 
Casp1/11+/+ mice were clearly inflamed and swollen (Fig. 4i-k). In 
contrast, 50% of A20myel-KO Casp1/11-/- mice were devoid of clinical 
signs of arthritis, and disease symptoms in the remaining A20myel-KO 
Casp1/11-/- mice were very mild (Fig. 4i-k). In agreement, analysis of 
Figure 4 | Nlrp3 and caspase-1 deletion rescues arthritis in A20myel-KO 
mice. a, IL-1β serum levels of A20FL/FL Nlrp3+/+ (n = 10), A20myel-KO Nlrp3+/+ 
(n = 10) and A20myel-KO Nlrp3-/- (n = 10) mice aged 20-35 weeks. b, Hind 
paws of 30-week-old mice. c, d, A20myel-KO Nlrp3+/+ (n = 20) and 
A20myel-KO Nlrp3-/- (n = 17) mice were scored for arthritis incidence (c) and 
severity (d). e, Representative ankle joint sections; magnification ×40 (top) 
and ×100 (bottom). f, Histological scores of ankle sections of 
A20myel-KO Nlrp3+/+ (n = 5) and A20myel-KO Nlrp3-/- (n = 5) mice. g, 
Micro-computed tomography images of hind paws of A20myel-KO Nlrp3+/+ and 
A20myel-KO Nlrp3-/- mice. Solid arrow, bone erosion; empty arrow, intact 
cartilage and bone. h, IL-1β serum levels of A20FL/FL Casp1/11+/+ (n = 11), 
A20myel-KO Casp1/11+/+ (n = 26) and A20myel-KO Casp1/11-/- (n = 16) mice 
aged 20-35 weeks. i, Hind paws of 25-week-old mice. j, k, 15-30-week-old 
A20myel-KO Casp1/11+/+ (n = 12), A20myel-KO Casp1/11+/- (n = 15) and 
A20myel-KO Casp1/11-/- (n = 24) mice were clinically scored for arthritis 
incidence (j) and severity (k). l, Representative ankle joint sections; 
magnification ×10 (top) and ×20 (bottom). P values were determined by 
Student’s t-test (a, d, f, k) and Mann-Whitney U test (h). 


joint sections of arthritic A20myel-KO Casp1/11+/- mice stained with 
haematoxylin and eosin showed massive mononuclear cell infiltration 
associated with marked articular, periarticular and synovial inflammation 
and reactive fibrosis. In contrast, joints of A20myel-KO Casp1/11-/- mice 
were markedly protected from the histopathology of rheumatoid 
arthritis (Fig. 4l). 

Collectively, these results demonstrate that A20 puts a brake on Nlrp3 
inflammasome activation by reducing basal and LPS-induced Nlrp3 
expression levels (Extended Data Fig. 5). Moreover, we showed that 
excessive Nlrp3 inflammasome activation drives arthritis pathogenesis 
in A20myel-KO mice. Arthritis in humans is a complex disease that may 


72 | NATURE | VOL 512 | 7 AUGUST 2014 


be caused by different combinations of genetic and environmental 
triggers. As such, available mouse models of rheumatoid arthritis may each 
be relevant to distinct subsets of patients with rheumatoid arthritis, or 
suitable for studying certain aspects of the pathogenesis of rheumatoid 
arthritis. Pathology in the widely used collagen- and antigen-induced 
mouse arthritis models occurs independently of inflammasome 
signalling. In contrast, we demonstrated here that arthritis pathology in 
A20myel-KO mice critically relies on the Nlrp3 inflammasome/IL-1 
signalling axis. Because both A20/Tnfaip3 and Nlrp3 are rheumatoid 
arthritis susceptibility genes, this suggests that A20myel-KO mice 
might be a suitable pre-clinical model for validating the effectiveness of 
experimental therapies for rheumatoid arthritis targeting 
inflammasomes and/or IL-1 signalling. 


METHODS SUMMARY 

Mice. Nlrp3-/- (ref. 22), Casp1/11-/- (ref. 23), Il1r1-/- (ref. 24) and A20myel-KO mice 
with a lysozyme M-Cre-targeted deletion of A20 in myeloid cells were as described. 
BMDM studies. BMDMs were isolated and the Nlrp3 inflammasome was 
activated by LPS in combination with ATP (Roche), nigericin (Sigma-Aldrich) or silica 
(US Silica). The Nlrc4 or AIM2 inflammasomes were activated by infection with 
Salmonella enterica serovar Typhimurium or transfection with plasmid DNA, 
respectively. 

Antibodies. The following antibodies were used: A20 (Santa Cruz Biotechnology), 
caspase-1, Nlrp3, ASC (Adipogen), IL-1β (Genetex), IL-18 (Biovision), IκBα, 
phospho-IκBα (Ser32) (Cell Signaling) and β-actin (Novus Biologicals). 
Cytokine analysis and lactate dehydrogenase measurement. Cytokine levels 
in culture medium of BMDMs and in serum were determined by Luminex assays 
(Bio-Rad) and IL-1 enzyme-linked immunosorbent assay (ELISA) (R&D Systems). 
Lactate dehydrogenase (LDH) activity (Promega) was used to quantify pyroptosis. 
Reverse transcription PCR. RNA was isolated using RNeasy kit (Qiagen) and 
mRNA levels were determined by quantitative reverse transcription PCR 
(qRT-PCR). 

Clinical scoring. Mice were randomly scored in a blinded fashion for 
development of peripheral arthritis. The severity of arthritis was assessed using a visual 
scoring system. 

Histology and histological scoring. Paraffin sections of murine paws were stained 
with haematoxylin and eosin for evaluation of inflammation and bone erosion. 
Histological scores were based on evaluation of two parameters, calcaneal erosion 
and inflammation at the synovio-entheseal complex, each ranging from 0 (normal) 
to 3. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
Rapid seeding of the viral reservoir prior to SIV 
viraemia in rhesus monkeys 
The viral reservoir represents a critical challenge for human 
immunodeficiency virus type 1 (HIV-1) eradication strategies. However, it 
remains unclear when and where the viral reservoir is seeded during 
acute infection and the extent to which it is susceptible to early 
antiretroviral therapy (ART). Here we show that the viral reservoir is seeded 
rapidly after mucosal simian immunodeficiency virus (SIV) infection of 
rhesus monkeys and before systemic viraemia. We initiated suppressive 
ART in groups of monkeys on days 3, 7, 10 and 14 after intrarectal 
SIVmac251 infection. Treatment with ART on day 3 blocked the 
emergence of viral RNA and proviral DNA in peripheral blood and also 
substantially reduced levels of proviral DNA in lymph nodes and gastro- 
intestinal mucosa as compared with treatment at later time points. In 
addition, treatment on day 3 abrogated the induction of SIV-specific 
humoral and cellular immune responses. Nevertheless, after disconti- 
nuation of ART following 24 weeks of fully suppressive therapy, virus 
rebounded in all animals, although the monkeys that were treated on 
day 3 exhibited a delayed viral rebound as compared with those treated 
on days 7, 10 and 14. The time to viral rebound correlated with total 
viraemia during acute infection and with proviral DNA at the time of 
ART discontinuation. These data demonstrate that the viral reservoir 
is seeded rapidly after intrarectal SIV infection of rhesus monkeys, 
during the ‘eclipse’ phase, and before detectable viraemia. This strik- 
ingly early seeding of the refractory viral reservoir raises important 
new challenges for HIV-1 eradication strategies. 

The viral reservoir in memory CD4+ T cells in HIV-1-infected 
individuals cannot be eliminated by current antiretroviral drugs or 
HIV-1-specific immune responses. This archive of replication-competent virus 
is the source of viral rebound in nearly all HIV-1-infected individuals who 
discontinue ART and represents a critical hurdle for HIV-1 eradication 
strategies. The temporal dynamics of seeding the viral reservoir have not 
been previously defined but have been presumed to occur during peak 
viraemia in acute HIV-1 infection. To evaluate the impact of early ART on 
the viral reservoir, we initiated suppressive ART at various time points 
after mucosal SIV infection of rhesus monkeys. 

We inoculated 20 Indian-origin adult rhesus monkeys (Macaca mulatta) 
that did not express the protective major histocompatibility complex 
(MHC) class I alleles Mamu-A*01, Mamu-B*08 and Mamu-B*17 with 
500 median tissue culture infective doses (TCID50) of SIVmac251 (refs 8-10) 
by the intrarectal route. We initiated ART on days 3, 7, 10 and 14 after 
infection with a pre-formulated cocktail of tenofovir, emtricitabine and 
dolutegravir (see Methods), and a control group received no ART (n = 4 
per group). ART was administered daily by subcutaneous injection for 
24 weeks. Treatment on day 3 after infection resulted in no detectable 
viraemia (<50 RNA copies ml⁻¹) at any time point in four of four 
monkeys (Fig. 1a). In contrast, treatment on days 7, 10 and 14 abruptly 
interrupted the exponential growth of the virus and reduced plasma viral RNA 
to undetectable levels within 3-4 weeks. The mean levels of plasma viral 
RNA at the time of ART initiation in these groups of monkeys were 
5.88 log copies ml⁻¹ (day 7), 7.11 log copies ml⁻¹ (day 10) and 7.50 log 
copies ml⁻¹ (day 14), which were comparable with the levels of plasma 
viral RNA in untreated controls at these time points (Fig. 1b). Viral dynamics 
modelling revealed an initial exponential growth rate of 1.5 ± 0.5 per 
day, corresponding to a basic reproductive ratio of R0 = 9.5 ± 5.1 (see 
Methods; Extended Data Fig. 1 and Extended Data Table 1). An 
exponential decay rate of plasma viraemia after ART initiation of 0.60 ± 0.17 
per day was observed in all the treated groups, corresponding to a 1.3 ± 
0.4 day half-life of infected cells (Extended Data Fig. 1). 
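
The relation between an exponential rate and a half-life or doubling time can be sketched as follows. This is only an illustration of the standard identity t = ln 2 / rate, not the viral dynamics model used in the study, and the function names are hypothetical. Applied to the mean decay rate of 0.60 per day it gives about 1.16 days, somewhat shorter than the reported 1.3 ± 0.4 days; one possible reason (an assumption, as the text does not say) is that the reported value averages half-lives fitted per animal rather than converting the mean rate.

```python
import math

# Illustrative sketch: convert exponential rates (per day) into characteristic times.
# These helpers are hypothetical and are not taken from the study's Methods.

def half_life(decay_rate_per_day):
    """Time for an exponentially decaying quantity to halve: ln(2) / rate."""
    if decay_rate_per_day <= 0:
        raise ValueError("decay rate must be positive")
    return math.log(2) / decay_rate_per_day


def doubling_time(growth_rate_per_day):
    """Time for an exponentially growing quantity to double: ln(2) / rate."""
    if growth_rate_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / growth_rate_per_day


# Mean rates quoted in the text.
decay_half_life_days = half_life(0.60)       # roughly 1.16 days
growth_doubling_days = doubling_time(1.5)    # roughly 0.46 days
```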

After initial control of viraemia, all animals treated with ART 
exhibited undetectable plasma viral loads (<50 RNA copies ml⁻¹) for the full 
Figure 1 | Viral decay kinetics after treatment with ART. a, b, Log plasma 
viral RNA (copies ml⁻¹) in rhesus monkeys infected with SIVmac251 and after 
initiation of ART on days 3, 7, 10 and 14 of infection (a) or with no ART (b). 
Assay sensitivity is 50 RNA copies ml⁻¹. Red arrows indicate initiation of ART. 
Black dots below x-axis indicate sampling time points. 
24-week course of suppressive therapy with no detectable viral blips 
(Fig. 1a), demonstrating the potency and consistency of this ART 
regimen. Moreover, ultrasensitive plasma viral load assays at week 20 also 
proved negative (<6 RNA copies ml⁻¹) in all animals (Extended Data 
Fig. 2). In addition, viral sequences from stimulated peripheral blood 
mononuclear cells (PBMCs) from ART-suppressed, SIV-infected 
monkeys using the same ART regimen revealed no viral sequence evolution 
over 6 months in a separate study (J.B.W., unpublished observations). 
Furthermore, treatment intensification studies in other SIV-infected 
rhesus monkeys in which the protease inhibitor darunavir was added 
to the current ART regimen did not lead to enhanced virological control 
(R.G., unpublished observations). Taken together, these data indicate that 
the ART regimen that was used in the present study was fully suppressive. 

We next assessed the development of SIV-specific humoral and 
cellular immune responses in these animals. Monkeys treated on day 3 after 
infection developed no detectable SIV Env-specific antibody responses by 
enzyme-linked immunosorbent assay (ELISA) (Fig. 2a) and no 
detectable SIV Env-, Pol- or Gag-specific T lymphocyte responses by interferon 
(IFN)-γ ELISPOT assays (Fig. 2b) at weeks 4, 10, and 20 or 24 of infection. 
In contrast, monkeys treated on days 7, 10 and 14 developed detectable 
but lower SIV-specific humoral and cellular immune responses as 
compared with untreated controls, presumably as a result of reduced 
antigenic stimulus after ART initiation. Multiparameter intracellular cytokine 
staining (ICS) assays confirmed that SIV Gag-specific CD8+ and CD4+ T 
lymphocyte responses were undetectable in animals treated on day 3 and 
were lower in animals treated on days 7, 10 and 14 as compared with 
untreated controls (Extended Data Figs 3 and 4). Gag-specific CD8+ and 
CD4+ T lymphocytes in monkeys treated on days 7, 10 and 14 also 
Figure 2 | SIV-specific humoral and cellular immune responses during 
ART. a, b, Env-specific ELISA antibody titres at weeks 0, 4, 10 and 24 (a) and 
Env-, Pol-, and Gag-specific IFN-γ ELISPOT responses at weeks 0, 4, 10 and 20 
in SIV-infected monkeys that initiated ART on days 3, 7, 10 and 14 of infection 
or with no ART (b). Mean responses are shown (N = 4 animals per group). 
SFC, spot-forming cells. Error bars show standard error of the mean (s.e.m.). 
exhibited reduced immune activation and proliferation as measured by 
Ki67 expression (Extended Data Fig. 4). These data demonstrate that 
initiation of ART on day 3 blocked the emergence of plasma viraemia 
and abrogated the induction of SIV-specific humoral and cellular immune 
responses. 

We next determined the impact of early ART on levels of proviral 
DNA in PBMCs, lymph node mononuclear cells (LNMCs) and 
gastrointestinal mucosa mononuclear cells (GMMCs) over the course of 24 
weeks of treatment with suppressive ART. In monkeys that initiated 
ART on day 3, there was a striking anatomical discordance with no proviral 
DNA detected in PBMCs at any time point (<3 DNA copies per 10⁶ cells) 
(Fig. 3a). In contrast, clear but low levels of proviral DNA were detected in 
inguinal LNMCs and in colorectal GMMCs in these animals, although 
proviral DNA declined to undetectable or nearly undetectable levels in 
three of four of these animals by week 24. In monkeys treated with ART 
on days 7, 10 and 14, proviral DNA was readily detected in PBMCs as well 
as in LNMCs and GMMCs (Fig. 3b-d). Moreover, in animals treated 
with ART on days 10 and 14, proviral DNA in LNMCs appeared to 
stabilize by week 12 with minimal subsequent decline, consistent with a 
stable viral reservoir (Fig. 3c, d). In untreated animals, proviral DNA 
was markedly higher than in ART-treated animals with minimal decline 
over time (Fig. 3e). Analysis of sorted cell subpopulations demonstrated 
that proviral DNA was found primarily in central memory and 
transitional memory CD4+ T lymphocytes in lymph nodes on day 3 and in 
both PBMCs and lymph nodes on day 7 after SIV infection (Extended 
Data Fig. 5). 
Figure 3 | Proviral DNA during ART. a-e, Log proviral DNA (copies per 10⁶ 
cells) in PBMCs, LNMCs and GMMCs in monkeys that initiated ART on days 3 
(a), 7 (b), 10 (c) and 14 (d) of infection or with no ART (e). f-h, Comparisons 
of mean levels of log proviral DNA per 10⁶ cells at the time of ART 
discontinuation (week 24) in ART-treated and untreated (NT) monkeys in 
PBMCs (f), LNMCs (g) and GMMCs (h) are also shown (N = 4 animals per 
group). Assay sensitivity is 3 DNA copies per 10⁶ cells. P values reflect 
one-sided t-tests. NS, not significant. Error bars show standard deviation. 
These data indicate that initiation of ART on day 3 reduced levels of 
proviral DNA at week 24 by at least 2.2 log in PBMCs (P < 0.0001; Fig. 3f), 
1.0 log in LNMCs (P = 0.004; Fig. 3g), and 0.9 log in GMMCs (P = not 
significant; Fig. 3h) as compared with initiation of ART on day 14. The 
high variability in the GMMC samples probably reflected sampling vari- 
ation in this anatomical compartment. Compared to untreated animals, 
initiation of ART on day 3 reduced levels of proviral DNA at week 24 by 
at least 3.3 log in PBMCs (P < 0.0001; Fig. 3f) and 3.1 log in LNMCs (P < 
0.0001; Fig. 3g). 
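
The "at least" qualifier in these comparisons follows from the assay sensitivity of 3 DNA copies per 10^6 cells: when proviral DNA is undetectable, only a lower bound on the log reduction can be stated. The following is a minimal Python sketch of that arithmetic. The function names and the input values are hypothetical and chosen for illustration; they are not data from the study.

```python
import math

# Assay sensitivity stated in the Figure 3 legend (DNA copies per 10^6 cells).
DETECTION_LIMIT = 3.0


def log10_level(copies_per_million):
    """Return log10 copies, flooring undetectable values at the assay limit."""
    return math.log10(max(copies_per_million, DETECTION_LIMIT))


def log_reduction(reference_copies, treated_copies):
    """Log10 reduction of the treated group relative to the reference group.

    If the treated value is below the detection limit, the result is a
    lower bound ("at least"), because the true level may be lower still.
    """
    return log10_level(reference_copies) - log10_level(treated_copies)


# Hypothetical values for illustration only (not study data).
reference_pbmc = 480.0  # copies per 10^6 cells in the comparison group
treated_pbmc = 0.0      # undetectable in the early-treated group

floor_log = log10_level(treated_pbmc)
reduction = log_reduction(reference_pbmc, treated_pbmc)
print(f"detection floor: {floor_log:.2f} log; reduction: at least {reduction:.1f} log")
```

The floor of about 0.48 log matches the value at which animals with undetectable proviral DNA are plotted in the correlation figure.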

At week 24, ART was discontinued, and monkeys were monitored twice 
weekly for evidence of viral rebound. Viral rebound, defined as plasma viral 
RNA >50 copies ml^-1, occurred in all animals (Fig. 4a). In particular, viral 
rebound occurred in four of four animals that initiated ART on day 3, 
albeit with threefold delayed kinetics as compared with animals that 
initiated ART at later time points (median 21 days to viral rebound in 
the day 3 treated animals compared with 7 days in the day 14 treated animals; 
P < 0.001; Fig. 4b). The median log setpoint viral load after rebound, defined 
as viral loads on days 56-112 following ART discontinuation, was also 
1.04 log RNA copies ml^-1 lower for all the ART-treated animals as compared 
with untreated controls (4.59 log RNA copies ml^-1 for all treatment 
groups combined versus 5.63 log RNA copies ml^-1 for untreated monkeys; 
P = 0.01; Fig. 4c), suggesting that there is a benefit to early ART, although 
no significant differences were observed among setpoint viral loads in the 
day 3, 7, 10 and 14 treatment groups. Setpoint viral loads in the untreated 
animals were comparable with historical controls”"®. Taken together, these 
data show that the persistent viral reservoir was seeded by day 3 of infec- 
tion and led to viral rebound in all monkeys after ART discontinuation. 
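
The rebound definition and the summary figures in this paragraph can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper `time_to_rebound` and the sample series are hypothetical, and only the medians (21 and 7 days; 4.59 and 5.63 log RNA copies ml^-1) are taken from the text.

```python
# Rebound threshold from the text: plasma viral RNA above 50 copies per ml.
REBOUND_THRESHOLD = 50.0


def time_to_rebound(days, viral_loads, threshold=REBOUND_THRESHOLD):
    """Return the first sampling day with viral load above threshold, or None."""
    for day, load in zip(days, viral_loads):
        if load > threshold:
            return day
    return None


# Hypothetical twice-weekly samples for one animal (illustrative, not study data).
sample_days = [3, 7, 10, 14, 17, 21]
sample_loads = [0, 0, 0, 40, 50, 320]
rebound_day = time_to_rebound(sample_days, sample_loads)

# Medians reported in the text.
fold_delay = 21 / 7                     # day 3 group versus day 14 group
setpoint_difference = 5.63 - 4.59       # untreated minus ART-treated, in log10
linear_fold = 10 ** setpoint_difference  # the same difference on a linear scale

print(rebound_day, fold_delay, round(setpoint_difference, 2), round(linear_fold, 1))
```

A value exactly at the threshold does not count as rebound here, which follows the ">50 copies" wording. The 1.04 log difference in setpoint corresponds to roughly an 11-fold difference in viral RNA copies.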

To gain mechanistic insight into the kinetics of the viral rebound, we 
used viral dynamics modelling’*’* (see Methods; Extended Data Fig. 6 
Figure 4 | Viral rebound kinetics after ART discontinuation. a, Log plasma 
viral RNA (copies ml^-1) in animals after discontinuation of ART at week 24 in 
monkeys that initiated ART on days 3, 7, 10 and 14 of infection. Assay sensitivity 
is 50 RNA copies ml^-1. b, Times to viral rebound, defined as the first time point 
at which plasma viral RNA was >50 copies ml^-1. Red bars indicate median 
times to viral rebound. c, Setpoint viral loads following viral rebound, defined as 
days 56-112 after ART discontinuation, compared with untreated monkeys. Red 
bars indicate median setpoint viral loads. P values reflect one-sided t-tests. 
and Extended Data Table 2). Initiation of ART on day 3 as compared 
with days 7, 10 and 14 resulted in lower modelled residual viral loads at 
the time of ART discontinuation (P = 0.01) and a trend towards a greater 
viral growth rate R0 during viral rebound (P = 0.06), but no difference in 
post-rebound setpoint viral loads (Extended Data Fig. 7). Average R0 
during viral rebound was 4.2 ± 1.8 in the day 3 treated animals as compared 
with 2.3 ± 0.6 in the day 14 treated animals (P = 0.05; Extended 
Data Fig. 7), presumably reflecting the partially effective SIV-specific immune 
responses in the latter group (Fig. 2). Total plasma viraemia during acute 
infection was interpolated and calculated as the area under the curve for 
pre-ART viral loads (AUC VL) (Extended Data Fig. 8). Consistent with 
recent findings in acute HIV-1 infection in humans"®, the AUC VL during 
acute infection correlated with levels of proviral DNA in PBMCs (P < 0.0001; 
Fig. 5a), LNMCs (P = 0.005; Fig. 5b) and GMMCs (P = 0.04; data not 
shown) at the time of ART discontinuation. Moreover, both the AUC 
VL (P < 0.0001; Fig. 5c) and proviral DNA in PBMCs at the time of ART 
discontinuation (P = 0.003; Fig. 5d) correlated inversely with the inter- 
polated time to viral rebound. These data suggest that total plasma virae- 
mia during acute infection and proviral DNA immediately before ART 
discontinuation may predict the time to viral recrudescence. 
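
The area under the curve for pre-ART viral loads can be illustrated with the trapezoidal rule. This is only a sketch: the interpolation procedure actually used is described in the Extended Data, so linear interpolation between samples, the use of log viral loads and the sample values below are all assumptions made for illustration.

```python
def auc_trapezoid(days, values):
    """Area under a piecewise-linear curve through (day, value) points."""
    if len(days) != len(values) or len(days) < 2:
        raise ValueError("need at least two paired time points")
    total = 0.0
    for i in range(1, len(days)):
        width = days[i] - days[i - 1]
        # Area of one trapezoid between consecutive sampling days.
        total += width * (values[i] + values[i - 1]) / 2.0
    return total


# Hypothetical pre-ART samples (log RNA copies per ml), not study data.
# 1.7 log is roughly the assay limit of 50 copies per ml.
pre_art_days = [0, 3, 7]
pre_art_log_vl = [1.7, 1.7, 5.7]

auc_vl = auc_trapezoid(pre_art_days, pre_art_log_vl)
print(f"AUC VL = {auc_vl:.1f} log copies per ml x days")
```

An animal treated on day 3 accumulates far less area than one treated on day 7 in this toy series, which is the sense in which AUC VL summarizes total viraemia before ART.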

We show that the viral reservoir can be seeded substantially earlier 
than previously recognized. After intrarectal SIV infection of rhesus mon- 
keys, the viral reservoir was seeded during the first few days of infection, 
during the eclipse phase, and before detectable viraemia, probably in 
the mucosal and lymphoid tissues that represent the first sites of viral 
replication'’. Consistent with this finding, we observed proviral DNA 
in lymph nodes and in gastrointestinal mucosa but not in PBMCs in mon- 
keys treated on day 3 after infection (Fig. 3a and Extended Data Fig. 5). 
The observation that the viral reservoir can be seeded before detectable 
viraemia suggests that substantial pathogenesis occurs in tissues in the 
first few days after mucosal virus exposure and prior to virus replica- 
tion in peripheral blood, which has important implications for HIV-1 
therapeutic and eradication strategies. 

Our data are concordant with recent clinical studies that have demon- 
strated that early ART can reduce the size of the viral reservoir and delay 
or reduce viral rebound after ART discontinuation in humans’***, Our 
findings similarly show that early ART decreased proviral DNA in blood 
and tissues (Fig. 3f-h) and delayed and reduced viral rebound (Fig. 4b, c) 
after ART discontinuation in SIV-infected rhesus monkeys. Moreover, 
Figure 5 | Viral dynamics and correlations. a, b, Correlations of AUC VL 
with proviral DNA in PBMCs (a) and LNMCs (b) at the time of ART 
discontinuation are shown. c, d, Correlations of AUC VL (c) and proviral DNA 
in PBMCs before ART discontinuation (d) with the interpolated time to viral 
rebound are also shown. R^2 and P values were calculated from correlation 
analyses, and trend lines were calculated using standardized major axis 
regression. Animals with undetectable proviral DNA were plotted at the 
detection limit of 3 DNA copies (0.48 log) per 10^6 cells. 
our observations extend previous studies that have shown effective post-exposure 
prophylaxis with short courses of ART when initiated 24 h 
after SIV infection in monkeys”***. However, in the present study, ini- 
tiation of suppressive ART even as early as day 3 after infection failed 
to eliminate the viral reservoir and did not prevent viral rebound despite 
24 weeks of fully suppressive ART. In addition, our data (Fig. 4a) are 
consistent with clinical studies that have shown that the vast majority 
of HIV-1-infected individuals who initiate ART during acute infection 
show viral rebound after discontinuation of ART?*””. 

Our findings contrast with the sustained remission and potential cure 
of a viraemic baby that was treated with ART at 30 h of life. It is possible 
that this baby was inoculated parenterally with maternal cells instead of 
mucosally with virus, resulting in rapid viraemia without a previraemic 
eclipse phase of viral replication in mucosal and lymphoid tissues. The 
positive outcome in this baby therefore might have reflected the route of 
transmission, the lack of an eclipse phase, the very rapid initiation of ART, 
and/or the paucity of memory CD4+ T lymphocytes in the neonatal 
immune system”. 

Clinical studies are required to confirm our observations, since import- 
ant differences exist between SIV infection of rhesus monkeys and HIV-1 
infection of humans. For example, the SIV dose used in the present study 
in monkeys was selected to limit the number of transmitted/founder 
viruses but also to infect all animals* and thus was substantially higher 
than typical HIV-1 doses in humans. Nevertheless, the higher challenge 
dose has been shown to shorten the eclipse period and to lead to earlier 
plasma viraemia*. Additional differences may also exist between SIV- 
infected rhesus monkeys and HIV-1-infected humans. 

The strikingly early seeding of the viral reservoir within the first few 
days of infection is sobering and presents new challenges to HIV-1 erad- 
ication strategies. If HIV-1 similarly seeds a persistent viral reservoir in 
mucosal and lymphoid tissues during the eclipse phase of infection and 
before viraemia after sexual exposure in humans, then it will be very dif- 
ficult to initiate ART before reservoir seeding, since viraemia is typically 
used for the clinical diagnosis of acute HIV-1 infection. Taken together, 
our data suggest that extremely early initiation of ART, extended ART 
duration, and probably additional interventions that target the viral 
reservoir will be required for HIV-1 eradication. Moreover, an improved 
understanding of the virological parameters that predict viral rebound 
after ART discontinuation will help guide future HIV-1 eradication efforts. 

Note added in proof: Since the acceptance of this paper, the ‘Mississippi 
baby’ that was treated with ART at 30 h of life and had a prolonged remission 
off therapy has now developed detectable levels of HIV-1 replication 
and has restarted therapy, demonstrating that early ART did not 
eradicate the viral reservoir in this child. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 13 May; accepted 19 June 2014. 
Published online 20 July 2014. 


1. Finzi, D. et al. Latent infection of CD4+ T cells provides a mechanism for lifelong 
persistence of HIV-1, even in patients on effective combination therapy. Nature 
Med. 5, 512-517 (1999). 

2. Zhang, L. et al. Quantifying residual HIV-1 replication in patients receiving 
combination antiretroviral therapy. N. Engl. J. Med. 340, 1605-1613 (1999). 

3. Chun, T. W., Davey, R. T., Jr, Engel, D., Lane, H. C. & Fauci, A. S. Re-emergence of HIV 
after stopping therapy. Nature 401, 874-875 (1999). 

4. Chun, T. W. et al. Quantification of latent tissue reservoirs and total body viral load 
in HIV-1 infection. Nature 387, 183-188 (1997). 

5. Chun, T. W. et al. Presence of an inducible HIV-1 latent reservoir during highly 
active antiretroviral therapy. Proc. Natl Acad. Sci. USA 94, 13193-13197 (1997). 

6. Persaud, D., Zhou, Y., Siliciano, J. M. & Siliciano, R. F. Latency in human 
immunodeficiency virus type 1 infection: no easy answers. J. Virol. 77, 1659-1665 
(2003). 

7. Ho, Y. C. et al. Replication-competent noninduced proviruses in the latent reservoir 
increase barrier to HIV-1 cure. Cell 155, 540-551 (2013). 

8. Liu, J. et al. Low-dose mucosal simian immunodeficiency virus infection restricts 
early replication kinetics and transmitted virus variants in rhesus monkeys. J. Virol. 
84, 10406-10412 (2010). 


9. Barouch, D. H. et al. Vaccine protection against acquisition of neutralization- 
resistant SIV challenges in rhesus monkeys. Nature 482, 89-93 (2012). 

10. Liu, J. et al. Immune control of an SIV challenge by a T-cell-based vaccine in rhesus 
monkeys. Nature 457, 87-91 (2009). 

11. Cline, A. N., Bess, J. W., Piatak, M., Jr & Lifson, J. D. Highly sensitive SIV plasma viral 
load assay: practical considerations, realistic performance expectations, and 
application to reverse engineering of vaccines for AIDS. J. Med. Primatol. 34, 
303-312 (2005). 

12. Nowak, M. A. et al. Viral dynamics of primary viremia and antiretroviral therapy in 
simian immunodeficiency virus infection. J. Virol. 71, 7518-7525 (1997). 

13. Palmer, S. et al. New real-time reverse transcriptase-initiated PCR assay with 
single-copy sensitivity for human immunodeficiency virus type 1 RNA in plasma. 
J. Clin. Microbiol. 41, 4531-4536 (2003). 

14. Whitney, J. B. et al. T-cell vaccination reduces simian immunodeficiency virus 
levels in semen. J. Virol. 83, 10840-10843 (2009). 

15. Rosenbloom, D. I., Hill, A. L., Rabi, S. A., Siliciano, R. F. & Nowak, M. A. Antiretroviral 
dynamics determines HIV evolution and predicts therapy outcome. Nature Med. 
18, 1378-1385 (2012). 

16. Archin, N. M. et al. Immediate antiviral therapy appears to restrict resting CD4+ cell 
HIV-1 infection without accelerating the decay of latent infection. Proc. Natl Acad. 
Sci. USA 109, 9523-9528 (2012). 

17. Haase, A. T. Targeting early infection to prevent HIV-1 mucosal transmission. 
Nature 464, 217-223 (2010). 

18. Ananworanich, J. et al. Impact of multi-targeted antiretroviral treatment on gut T 
cell depletion and HIV reservoir seeding during acute HIV infection. PLoS ONE 7, 
e33948 (2012). 

19. von Wyl, V. et al. Early antiretroviral therapy during primary HIV-1 infection results 
in a transient reduction of the viral setpoint upon treatment interruption. PLoS 
ONE 6, e27463 (2011). 

20. Hocqueloux, L. et al. Long-term antiretroviral therapy initiated during primary 
HIV-1 infection is key to achieving both low HIV reservoirs and normal T cell counts. 
J. Antimicrob. Chemother. 68, 1169-1178 (2013). 

21. Sáez-Cirión, A. et al. Post-treatment HIV-1 controllers with a long-term virological 
remission after the interruption of early initiated antiretroviral therapy ANRS 
VISCONTI Study. PLoS Pathog. 9, e1003211 (2013). 

22. Steingrover, R. et al. HIV-1 viral rebound dynamics after a single treatment 
interruption depends on time of initiation of highly active antiretroviral therapy. 
AIDS 22, 1583-1588 (2008). 

23. Tsai, C. C. et al. Effectiveness of postinoculation (R)-9-(2-phosphonylmethoxypropyl) 
adenine treatment for prevention of persistent simian immunodeficiency virus 
SIVmne infection depends critically on timing of initiation and duration of 
treatment. J. Virol. 72, 4265-4273 (1998). 

24. Tsai, C. C. et al. Prevention of SIV infection in macaques by (R)-9-(2- 
phosphonylmethoxypropyl)adenine. Science 270, 1197-1199 (1995). 

25. Sáez-Cirión, A. et al. Post-treatment HIV-1 controllers with a long-term virological 
remission after the interruption of early initiated antiretroviral therapy ANRS 
VISCONTI Study. PLoS Pathog. 9, e1003211 (2013). 

26. Stohr, W. et al. Duration of HIV-1 viral suppression on cessation of antiretroviral 
therapy in primary infection correlates with time on therapy. PLoS ONE 8, e78287 
(2013). 

27. Rosenberg, E. S. et al. Safety and immunogenicity of therapeutic DNA vaccination 
in individuals treated with antiretroviral therapy during acute/early HIV-1 
infection. PLoS ONE 5, e10555 (2010). 

28. Persaud, D. et al. Absence of detectable HIV-1 viremia after treatment cessation in 
an infant. N. Engl. J. Med. 369, 1828-1835 (2013). 

29. Liu, J., Li, H., Iampietro, M. J. & Barouch, D. H. Accelerated heterologous adenovirus 
prime-boost SIV vaccine in neonatal rhesus monkeys. J. Virol. 86, 7829-7835 (2012). 


Acknowledgements We thank M. Pensiero, M. Marovich, C. Dieffenbach, W. Wagner, 
C. Gittens, J. Yalley-Ogunro, M. Nowak, R. Siliciano, D. Persaud, S. Deeks, N. Chomont, 
J. Ananworanich, L. Picker, F. Stephens, R. Hamel, K. Kelly and L. Dunne for advice, 
assistance and reagents. The SIVmac239 peptides were obtained from the National 
Institutes of Health (NIH) AIDS Research and Reference Reagent Program. We 
acknowledge support from the US Army Medical Research and Material Command and 
the US Military HIV Research Program through its cooperative agreement with the 
Henry M. Jackson Foundation for the Advancement of Military Medicine 
(W81XWH-07-2-0067, W81XWH-11-2-0174); the NIH (AI060354, AI078526, 
AI084794, AI095985, AI096040, AI100645); and the Ragon Institute of MGH, MIT and 
Harvard. The views expressed in this manuscript are those of the authors and do not 
represent the official views of the Department of the Army or the Department of Defense. 


Author Contributions J.B.W., R.G., M.L.R., J.H.K., N.L.M. and D.H.B. designed the studies 
and interpreted the data. J.B.W. and S.S. led the virological assays. P.P.-M., J.L., M.S., L.P., 
C.C., J.S., S.B., J.Y.S., A.L.B., L.E.P., E.N.B. and K.M.S. led the study operations and the 
immunological assays. A.L.H. and D.I.S.R. led the mathematical modelling and 
statistical analysis. M.G.L. led the clinical care of the rhesus monkeys. B.L., J.Ha., J.Hi. 
and R.G. developed the antiretroviral drug cocktail. J.B.W. and D.H.B. wrote the paper 
with all co-authors. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to 
D.H.B. (dbarouch@bidmc.harvard.edu). 


7 AUGUST 2014 | VOL 512 | NATURE | 77 


©2014 Macmillan Publishers Limited. All rights reserved 



doi:10.1038/nature13383 


Neuropathy of haematopoietic stem cell niche is 
essential for myeloproliferative neoplasms 


Lorena Arranz, Abel Sanchez-Aguilera, Daniel Martin-Pérez, Joan Isern, Xavier Langa, Alexandar Tzankov, 
Pontus Lundberg, Sandra Muntion, Yi-Shiuan Tzeng, Dar-Ming Lai 
& Simon Méndez-Ferrer 


Myeloproliferative neoplasms (MPNs) are diseases caused by mutations 
in the haematopoietic stem cell (HSC) compartment. Most MPN 
patients have a common acquired mutation of the Janus kinase 2 (JAK2) 
gene in HSCs that renders this kinase constitutively active, leading 
to uncontrolled cell expansion. The bone marrow microenvironment 
might contribute to the clinical outcomes of this common event. We 
previously showed that bone marrow nestin+ mesenchymal stem cells 
(MSCs) innervated by sympathetic nerve fibres regulate normal HSCs. 
Here we demonstrate that abrogation of this regulatory circuit is essential 
for MPN pathogenesis. Sympathetic nerve fibres, supporting Schwann 
cells and nestin+ MSCs are consistently reduced in the bone marrow of 
MPN patients and mice expressing the human JAK2(V617F) mutation 
in HSCs. Unexpectedly, MSC reduction is not due to differentiation but 
is caused by bone marrow neural damage and Schwann cell death triggered 
by interleukin-1β produced by mutant HSCs. In turn, in vivo 
depletion of nestin+ cells or their production of CXCL12 expanded 
mutant HSC number and accelerated MPN progression. In contrast, 
administration of neuroprotective or sympathomimetic drugs prevented 
mutant HSC expansion. Treatment with β3-adrenergic agonists 
that restored the sympathetic regulation of nestin+ MSCs prevented 
the loss of these cells and blocked MPN progression by indirectly 
reducing the number of leukaemic stem cells. Our results demonstrate 
that mutant-HSC-driven niche damage critically contributes 
to disease manifestation in MPN and identify niche-forming MSCs 
and their neural regulation as promising therapeutic targets. 

The stem cell niche has recently emerged as an oncogenic unit and an 
important element in regulating cancer stem cells, including HSCs. 
Most MPN patients who do not carry the BCR-ABL fusion have an acquired 
mutation in Janus kinase 2 (JAK2(V617F)) in HSCs that results in constitutive 
kinase activity, leading to uncontrolled expansion of HSCs and 
erythroid, megakaryocytic and myeloid progenitors. Somatic mutations 
in thrombopoietin receptor or calreticulin genes are found in some 
MPN patients and additional HSC mutations also affect disease progression. 
Changes in the HSC microenvironment might also contribute to 
MPN development, and expansion of bone marrow (BM) fibroblasts and 
bone-forming cells suggests the participation of MSCs. 

We previously reported that mouse BM nestin+ MSCs are required 
to maintain HSCs and that human BM nestin+ cells can expand HSCs. 
Here we found that, despite elevated BM blood-vessel density in MPN 
patients, nestin+ cell number and nestin messenger RNA expression were 
markedly reduced (Fig. 1a, b). This was reproduced in Mx1-cre;JAK2(V617F) 


Figure 1 | Apoptosis of BM nestin+ HSC niche cells contributes to MPN 
progression. a, BM sections of controls (left) and MPN patients (right) 
immunostained with nestin (brown, all panels) and CD34 (red, lower panels; 
magnification ×200). Numbers of nestin+ niches per mm^2 (mean ± s.d.) 
were 1.15 ± 0.3 (control) and 0.17 ± 0.18 (MPN; n = 40; P= 10-°, Mann- 
Whitney U test). b, c, Nestin mRNA expression in BM cells from 
controls (n = 2), MPN patients (n = 11) and mice (n = 4). d, Nes-GFP+ 
cells in skull BM of control (left) and MPN mice (right; n = 10). 
e-g, CD45− CD31− Ter119− Nes-GFP+ cells (e), clonal self-renewing 
mesenchymal sphere (mesensphere)-forming cells (f) and fibroblastic colony- 
forming units (CFU-F, g) in BM cells from wild-type mice 30 weeks after 
transplantation with control (n = 7) or MPN BM cells (n = 12). h, i, Lineage- 
tracing studies of nestin+ cells. Femoral sections of tamoxifen-treated 
Nes-creERT2;RCE:loxP mice 28 weeks after transplantation with control (h) or 
MPN BM cells (i), showing fluorescent signals from GFP and nuclei 
counterstained with DAPI (n = 4). j, Fraction of live, early and late apoptotic 
BM stromal Nes-GFP+ cells from control (n = 7) or MPN mice (n = 5). 
k-m, Blood counts (k), femoral trichromic (l) and spleen haematoxylin and 
eosin stainings (m) of Nes-creERT2;iDTA and control mice 20 weeks after 
transplantation of MPN BM cells (n = 3). n, Frequency of lin− sca-1+ c-kit+ 
(LSK) haematopoietic progenitors in BM nucleated cells (BMNC) of 
Nes-creERT2;Cxcl12f/f (n = 5) and control littermates (n = 7) 30 weeks after 
transplantation with MPN BM cells and 24 weeks after tamoxifen 
treatment. c-e, j, 6-8 weeks after pIpC treatment. Scale bars (d, h), 200 μm. 
Magnification (l, m) ×100. Mean ± s.e.m. *P < 0.05; **P < 0.01 (unpaired 
two-tailed t test). BM, bone marrow. C, control (disease-free) mice. 
Figure 2 | BM Schwann cell death triggered by HSPC-derived interleukin-1β 
precedes nestin+ MSC apoptosis in MPN. a, Principal component analysis 
of control and MPN BM CD45− CD31− Ter119− Nes-GFP+ cells compared with 
mesenchymal stromal (adult BM Pdgfrα+ Sca1+ and neonatal BM Pdgfrα+ 
Nes-GFP+/− cells; purple shaded area) and Schwann cells (embryonic day 12.5 
(E12.5) Schwann cell precursors and neonatal Schwann cells; green shaded area; 
see Methods). b, qPCR validation (n = 3). c, Nes-gfp skull BM 5 weeks after 
transplantation with control or MPN BM cells; fluorescent signals of GFP (green) 
and sympathetic nerve fibres detected with anti-tyrosine hydroxylase (Th) 
antibodies (red). d, Immunostaining of glial fibrillary acidic protein (Gfap) to 
visualize Schwann cells in control and MPN BM. Scale bars, 200 μm (c), 100 μm 
(d). e, Quantification of Th+ fibres in BM from controls (n = 2) and MPN 
patients (n = 16). f, GFAP mRNA expression in BM cells of controls (n = 2), 
MPN patients (n = 11) and mice (n = 4). g, Time-course analyses of Schwann 
cells (Gfap), sympathetic fibres (Th) and Nes-GFP+ cell apoptosis in 
Nes-gfp;Mx1-cre;JAK2(V617F)+ and control mice 2, 4 (n = 5) and 8 (n = 7) weeks 
after pIpC treatment. h, Frequency of BM CD45− CD31− Ter119− CD105+ cells 
after 18-week interleukin-1 receptor antagonist (IL1ra) treatment (n = 5). 
i, Apoptotic rate of MSCs and Schwann cells in vitro derived from neonatal BM 
Nes-GFP+ cells and co-cultured for 24 h with control or MPN adult BM lin− 
sca-1+ c-kit+ (LSK) cells (± 200 ng ml^-1 IL1ra) (n = 3 experiments). TUNEL 
staining (pink) of Schwann cells; nuclei were counterstained with DAPI (blue). 
Scale bar, 100 μm. Mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001 
(unpaired two-tailed t test). HSPC, haematopoietic stem and progenitor cells. 


mice that developed MPN upon pIpC treatment (Fig. 1c). MPN mice 
were intercrossed with a Nes-gfp line to label MSCs with green fluorescent 
protein (GFP). Both compound mutant mice and Nes-gfp animals transplanted 
with mutant BM cells developed MPN and had fewer BM MSCs 
defined by GFP, surface marker expression and functional assays 
(Fig. 1d-g and Extended Data Fig. 1a-f). Because MSC loss was concomitant 
with incipient BM fibrosis, we conducted long-term in vivo 
lineage-tracing studies to determine whether nestin+ MSCs differentiate 
into fibroblasts or osteoblasts in MPN, thereby contributing to the 
stromal changes in these mice (Extended Data Fig. 1g-j). Unexpectedly, 
the vascular patterns of GFP+ cells were similar to those in unaffected 


Figure 3 | Treatment with β3-adrenergic agonist or neuroprotective drug 
blocks MPN progression. a, b, Neurotrophic rescue of BM Schwann cells 
blocks MPN progression. Wild-type mice transplanted with MPN BM cells were 
treated over a month with BRL37344, the neuroprotective drug 4-methylcatechol 
(4MC) or vehicle. a, Skull BM immunostaining of glial fibrillary acidic protein 
(red) to visualize Schwann cells (n = 5); scale bar, 100 μm. b, Circulating 
neutrophils (control, JAK2(V617F) 4MC, JAK2(V617F) BRL37344, n = 5; 
JAK2(V617F) vehicle, n = 9). c, Circulating erythrocytes, neutrophils and 
platelets 16 weeks after transplantation of MPN BM cells into β3-adrenergic 
receptor-deficient (KO, n = 8) and wild-type (WT) mice (n = 7). d, Van Gieson 
stainings of femoral BM of mice in c. e-g, Compensation of BM sympathetic 
damage by selective sympathomimetic drugs blocks MPN progression and 
prevents fibrosis. e, Blood counts of WT mice transplanted with MPN BM cells 
and chronically treated with selective β3-adrenergic agonist (BRL37344) or 
vehicle (n = 5). f, Reticulin and Van Gieson stainings of femoral BM of mice in 
e (magnification, ×200). g, In vivo depletion of nestin+ cells reduces the 
therapeutic effect of BRL37344. Blood counts of iDTA control (n = 5) and 
Nes-creERT2;iDTA mice treated with vehicle (n = 4) or BRL37344 (n = 5) for 6 weeks. 
Drug treatments in a, b, e, f started 4 weeks after transplantation; treatment in 
g started 2 weeks after transplantation combined with tamoxifen (4 weeks). 
Mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001 (unpaired two-tailed t test). 
Nes-gfp mice (Fig. 1h, i). Thus, as was recently reported in BCR-ABL+ 
MPN, Nes-GFP− cells might generate excessive fibroblasts and osteoblasts. 
Nestin+ MSC reduction was instead explained by a threefold 
increased apoptotic rate in mutant mice (Fig. 1j), and was not prevented 
by the JAK inhibitor ruxolitinib (Extended Data Fig. 2). 

To determine whether nestin+ MSC death could in turn stimulate MPN 
progression, we selectively depleted nestin+ cells in vivo. This depletion 
did not affect mature BM Schwann cells, reported to express nestin, 
but reduced MSCs, associated with increased white and red blood cells 
(Fig. 1k and Extended Data Fig. 3a-e). BM sections revealed excessive 
fibroblasts and bone formation, not yet detectable in control mice (Fig. 1l), 
consistent with nestin+ cells not generating fibroblasts or bone cells in 
MPN. Disease acceleration following nestin+ cell depletion also manifested 
as severe spleen infiltration, still absent in control mice (Fig. 1m). 

At an early disease stage, the most primitive HSCs showed the highest expansion, 
leading to increased haematopoietic progenitors in BM, peripheral 
blood and spleen. The chemokine Cxcl12 regulates HSC migration and 
quiescence and is highly expressed by nestin+ MSCs. Early HSC 
mobilisation correlated with decreased BM Cxcl12, consistent with MSC 
reduction. In addition, Cxcl12 expression dropped 70-fold in MPN BM 
Nes-GFP+ cells (Extended Data Fig. 3f-k). Deletion of Cxcl12 (ref. 24) in 
nestin+ cells in vivo increased BM haematopoietic progenitors and circulating 
platelets (Fig. 1n and Extended Data Fig. 3l). MSC-derived 
Cxcl12 can thus negatively regulate JAK2(V617F)+ HSC expansion. 

To better understand BM nestin+ cell alterations, genome-wide expression 
was profiled by next-generation sequencing. Expression of MSC and 
HSC-related genes was lower in MPN Nes-GFP+ cells, which instead showed 
enrichment in Schwann cell genes and neural-related functional categories 
(Extended Data Figs 4a-f and 10). Principal component analyses of 
independent biological samples compared with publicly available data 
showed that control Nes-GFP+ cells were closest to mesenchymal progenitors, 
whereas MPN Nes-GFP+ cells clustered away from them and close to 
Schwann cells (Fig. 2a). These changes, confirmed by quantitative PCR 
(Fig. 2b), suggested an altered neural component of the HSC niche in MPN. 

Sympathetic nerve fibres and ensheathing Schwann cells, adjacent to 
distinctive Nes-GFP+ cells, and GFAP mRNA expression were markedly 
reduced in BM of MPN patients and mice (Fig. 2c-f and Extended Data 
Fig. 4g-i). Time-course analysis showed that BM neural damage precedes 
Nes-GFP+ cell apoptosis (Fig. 2g), indicating that sympathetic neuropathy 
could sensitise nestin+ cells to cell death triggered by mutant cells. 
Multiplex ELISA detected early increased interleukin-1β in MPN BM 
(Extended Data Fig. 5a). This cytokine and its activating enzyme caspase-1 
were expressed by monocytes, as previously reported, but also by 
haematopoietic progenitors (Extended Data Fig. 5b-d). Compared with 
BM Nes-GFP− stromal cells, mRNA levels of interleukin-1 receptor and 
its antagonist were 10- and 1,000-fold higher in Nes-GFP+ cells, respectively, 
and specifically upregulated in Nes-GFP+ cells in MPN (Extended Data 
Fig. 5e, f). Therefore we chronically treated mice with an antagonist of 
interleukin-1 receptor. This treatment reduced platelet counts and increased 
BM MSC frequency, associated with reduced caspase-1 mRNA expression 
in haematopoietic progenitors (Fig. 2h and Extended Data Fig. 5g-j). 
We studied whether JAK2(V617F)+ HSCs might directly cause BM 
Schwann cell death. Unlike MSCs, BM-derived Schwann cells co-cultured 
with JAK2(V617F)+ haematopoietic progenitors showed a threefold higher 
apoptotic rate, which was blocked by interleukin-1 receptor antagonist 
(Fig. 2i). Together, these data suggest that HSC-derived interleukin-1β 
contributes to neuroglial damage, which compromises MSC survival. We 
therefore investigated whether sympathetic neuropathy might underlie 
HSC niche alterations and thus represent a therapeutic target in MPN. 

We treated symptomatic MPN mice with the neuroprotective agent 
4-methylcatechol, which can protect BM sympathetic nerve fibres during 
chemotherapy. Schwann cells were preserved in 4-methylcatechol-treated 
mice, and neutrophilia was prevented (Fig. 3a, b). Sympathetic 
nerve fibres regulate BM HSC traffic via β3-adrenergic receptor activation 
in nestin+ MSCs. This receptor is not expressed in normal or JAK2(V617F)+ 
haematopoietic cells (Extended Data Fig. 6a). Disease development was 
mutant HSC expansion in MPN. a-d, Efficacy of BRL37344 treatment in 
advanced MPN; WT mice transplanted with control or MPN BM cells were 
treated with BRL37344 or vehicle (veh) upon thrombocytosis (869 ± 23 × 10^6 
and 1,968 ± 264 × 10^6 platelets per ml of blood, respectively; n = 5). a, Blood 
counts; †P < 0.05, ††P < 0.01, †††P < 0.001 vs vehicle-treated control mice. 
b, IL-1 content in BM supernatant. c, Van Gieson staining of femoral BM 
sections (magnification, x 100). d, Quantification of BM glial fibrillary acidic 
protein (Gfap) immunostaining. e-l, Sympathomimetic drug restores BM 
Cxcl12 levels and prevents HSC expansion and mobilisation. WT mice 
transplanted with control or MPN BM cells were treated 4 weeks with 
BRL37344, 4-methylcatechol or vehicle over 4 (e, h), 8 (f) or 16 weeks 

(g, i, j). e, IL-1β content in BM supernatant (control vehicle, n = 10;
JAK2(V617F) vehicle, n = 11; JAK2(V617F) 4-methylcatechol, JAK2(V617F)
BRL37344, n = 5). f, BM CD45− CD31− Ter119− Nes-GFP+ cells in vehicle
(n = 8) and BRL-treated (n = 10) MPN mice. g, Cxcl12 content in BM
supernatant (n = 5). h–j, Frequency of colony-forming units (CFU-C) in BM
nucleated cells (BMNC) (h, n = 5; j, n = 3) and blood (i), and BM fraction of
lin− sca-1+ c-kit+ (LSK) cells (n = 3) (j). k, LSK cells, long-term (LT-) and
short-term (ST-) HSCs (n = 5) in BM from mice in a–d. l, Reduction of
leukaemic stem cells in BRL37344-treated mice. CD45.2 WT mice were 
transplanted with BM cells from CD45.1 mice together with BM cells from 
MPN mice treated with vehicle or BRL37344 (n = 5). Frequency of mice with 
<50% donor LSK cell chimaerism 16 weeks after transplantation is plotted 
against tested cell number. MPN-initiating cell frequency is indicated. 

*P < 0.05, Pearson chi-squared t test. m, Model illustrating HSC niche 
alterations and rescue in MPN. MPN, myeloproliferative neoplasm; HSC, 
haematopoietic stem cell; SNS, sympathetic nervous system; MSC, 
mesenchymal stem cell; NA, noradrenaline; AR, adrenergic receptor; C, control 
(disease-free mice). a–k, Mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001
(unpaired two-tailed t test).

accelerated in mice lacking the β3-adrenergic receptor (Fig. 3c, d), uncovering
a protective role of this receptor in MPN. Symptomatic mice were
chronically treated with a selective β3-adrenergic agonist (BRL37344) to
compensate for deficient sympathetic stimulation of nestin+ MSCs.
Notably, BRL37344 treatment prevented MPN-associated neutrophilia
and thrombocytosis, and delayed red blood cell reduction, but did not
affect blood counts in wild-type mice (Fig. 3b, e and Extended Data
Fig. 6b–d). Contrasting with the severe fibrosis in vehicle-injected
mice, the BM of BRL37344-treated animals was virtually devoid of excessive
bone and fibroblastic tissue (Fig. 3f). These effects were HSC niche-dependent,
because neither 4-methylcatechol nor BRL37344 affected
the growth of cultured haematopoietic progenitors and leukocytosis
was not rescued by BRL37344 in mice with nestin+ cell depletion (Fig. 3g
and Extended Data Fig. 7a). Similarly, several MPN markers were improved
by treatment with the clinically approved β3-adrenergic agonist mirabegron
(Extended Data Fig. 7b), albeit to a lesser extent, probably
owing to its poor solubility and relatively low affinity for the murine
receptor. To investigate the potential therapeutic benefit when administered
at more advanced stages, thrombocytotic and control mice
were treated with BRL37344. This treatment reduced neutrophilia,
erythrocytosis, thrombocytosis, BM interleukin-1, fibrosis and osteosclerosis,
rescued BM Schwann cells (Fig. 4a–d) and blocked Schwann
cell program activation in BM nestin+ cells (Extended Data Fig. 7c).
MPN progression can thus be blocked by protection or rescue of BM
neuroglia and by compensation of deficient sympathetic stimulation of
nestin+ MSCs by β3-adrenergic agonists.

We next asked whether MPN blockade could be mediated by pre- 
servation of MSCs and their HSC regulatory function. BRL37344 reduced 
IL-1, restored Nes-GFP+ cell number and increased Cxcl12 levels in BM
(Fig. 4e-g). Early BRL37344 or 4-methylcatechol treatment prevented 
mutant haematopoietic progenitor expansion (Fig. 4h and Extended 
Data Fig. 8). Long-term BRL37344 treatment did not compromise nor- 
mal HSCs but efficiently decreased mutant haematopoietic progenitors 
(Fig. 4i, j and Extended Data Fig. 9), even when administered at throm- 
bocytosis stage (Fig. 4k). Moreover, BRL37344-treated MPN mice showed 
a 4.5-fold reduction in leukaemic stem cells (Fig. 4l).

Our findings point to mutant HSCs as the cause of BM neuroglial 
damage that compromises MSC survival and function, critically con- 
tributing to MPN pathogenesis (Fig. 4m). The study shows that the niche 
damage triggered by the mutant HSC is essential for the development of a 
haematopoietic malignancy previously considered to be caused by the 
HSC alone. Targeting HSC niche-forming MSCs and their neural regu- 
lation may pave the way to more efficient therapeutic strategies in MPN. 


METHODS SUMMARY 

Mx1-cre;JAK2(V617F), Nes-gfp, Nes-creERT2 (ref. 28), RCE-loxP, iDTA, Cxcl12-floxed,
Adrb3, CD45.1 and CD45.2 C57BL/6J mice (Jackson Laboratories)
were housed in specific-pathogen-free facilities. Protocols were approved by the Ani- 
mal Welfare Ethical Committee. In vivo treatments, cell extraction, culture, FACS, 
histological analyses, ELISA, qPCR, genomic and statistical analyses are detailed in 
the Methods. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper.

‘Gain’ of supernumerary copies of the 8q24.21 chromosomal region
has been shown to be common in many human cancers and is
associated with poor prognosis. The well-characterized myelocytomatosis
(MYC) oncogene resides in the 8q24.21 region and is
consistently co-gained with an adjacent ‘gene desert’ of approximately 
2 megabases that contains the long non-coding RNA gene PVT1, the 
CCDC26 gene candidate and the GSDMC gene. Whether low copy- 
number gain of one or more of these genes drives neoplasia is not 
known. Here we use chromosome engineering in mice to show that 
a single extra copy of either the Myc gene or the region encompass- 
ing Pvt1, Ccdc26 and Gsdmc fails to advance cancer measurably, 
whereas a single supernumerary segment encompassing all four genes 
successfully promotes cancer. Gain of PVT1 long non-coding RNA 
expression was required for high MYC protein levels in 8q24-amplified 
human cancer cells. PVT1 RNA and MYC protein expression corre- 
lated in primary human tumours, and copy number of PVT1 was 
co-increased in more than 98% of MYC-copy-increase cancers. Abla- 
tion of PVT1 from MYC-driven colon cancer line HCT116 dimin- 
ished its tumorigenic potency. As MYC protein has been refractory 
to small-molecule inhibition, the dependence of high MYC protein 
levels on PVT1 long non-coding RNA provides a much needed ther- 
apeutic target. 

To determine whether low copy-number gain of MYC and/or other
genetic element(s) in the 8q24.21 region (PVT1, CCDC26 and GSDMC;
Fig. 1a and Extended Data Fig. 1) promotes cancer, chromosome
engineering (Extended Data Fig. 2) was used to derive mice bearing
single-copy gain of (1) Myc alone (Fig. 1b and Extended Data Fig. 2a),
(2) Pvt1, Ccdc26 and Gsdmc (Fig. 1c and Extended Data Fig. 2b) and
(3) the entire 2-Mb Myc,Pvt1,Ccdc26,Gsdmc syntenic region (Fig. 1d
and Extended Data Fig. 2c). All three strains were viable and fertile and
showed no overt developmental defect. In human breast and ovarian
cancers, gain of 8q24 is often accompanied by ERBB2 amplification.
Accordingly, each mutant line was bred to MMTVneu transgenic mice
and mammary tumour latency was examined. gain(Myc),MMTVneu
mice developed mammary tumours at essentially the same median latency
(345 days) as MMTVneu mice (Fig. 1e), indicating that a single supernumerary
Myc gene is insufficient to promote MMTVneu-driven cancer.
Tumour latency of gain(Pvt1,Ccdc26,Gsdmc),MMTVneu mice was also
indistinguishable from MMTVneu mice (316 days; Fig. 1e). In contrast,
gain(Myc,Pvt1,Ccdc26,Gsdmc),MMTVneu mice showed shorter mammary
tumour latency (224 days; Fig. 1e) and increased penetrance (90%)
compared with the other three genotypes (40-60%). Compared with
age-matched MMTVneu adenomas, gain(Myc,Pvt1,Ccdc26,Gsdmc),
MMTVneu tumours presented a high mitotic index involving locally
invasive solid masses that invaded adjacent thin-walled blood vessels,
consistent with adenocarcinomas (Extended Data Fig. 3a–c). Three out
of ten gain(Myc,Pvt1,Ccdc26,Gsdmc) mice, but none of the gain(Myc),
gain(Pvt1,Ccdc26,Gsdmc) and wild-type mice (n = 10 for each genotype),
developed metastasis in the absence of MMTVneu in a limited-lifespan
study (500 days), indicating that gain(Myc,Pvt1,Ccdc26,Gsdmc) contributes
to spontaneous metastasis in older mice, albeit with low penetrance
(Extended Data Fig. 3d, e). These data demonstrate that gain of a
single copy of Myc cooperates with one or more genetic element(s) in
the Pvt1/Ccdc26/Gsdmc region to promote mammary tumorigenesis.

The mammary epithelium of 10-week-old gain(Myc,Pvt1,Ccdc26,Gsdmc) 
mice showed pre-cancerous properties in the absence of MMTVneu, 
including elevated levels of γ-H2AX (a marker of DNA damage; Fig. 2a),
p53 (a key mediator of cellular stress response) and phospho-ERK1/2 (a 
pro-survival signalling molecule; Extended Data Fig. 4a, b). Accordingly, 
gain(Myc,Pvt1,Ccdc26,Gsdmc) mammary epithelium showed increased 
DNA replication (Fig. 2b), reduced oestrogen receptor-α expression
(Extended Data Fig. 4c) and 5.5 times more lateral ductal branching 
than wild type (Fig. 2c). gain(Myc) glands showed a more modest two- 
fold increase in branching compared with wild type. Furthermore, 
gain(Myc,Pvt1,Ccdc26,Gsdmc) mice showed aberrant differentiation, 
including a precocious alveolar-like phenotype and co-expression of 
keratin-14 (a myoepithelial marker) with keratin-8, in luminal epithe- 
lial cells (Extended Data Fig. 4d, e). Crossing gain(Myc,Pvt1,Ccdc26, 
Gsdmc) mice with mice carrying a deletion of precisely the same region 
resulted in mice with two copies of this region on one chromosome 15 
homologue and no copy on the other, which completely abolished aber- 
rant cellular proliferation (Fig. 2d) and excessive branching (Fig. 2e). 
Three copies of the Myc/Pvt1/Ccdc26/Gsdmc region, therefore, pro- 
duced pre-cancerous phenotypes. 

We next sought to identify genetic elements driving gain(Myc,Pvt1,
Ccdc26,Gsdmc) neoplasia. Ccdc26 transcript has not been detected in
mouse, and Gsdmc expression was low in mouse mammary tissue with
no differences in Gsdmc2-4 transcript levels across genotypes (Extended
Data Fig. 5). Consequently, we focused on Myc and Pvt1. Cells from
two independent gain(Myc,Pvt1,Ccdc26,Gsdmc),MMTVneu mammary
tumours were transfected with short interfering RNAs (siRNAs) to
knock down Myc (siMyc), Pvt1 (siPvt1) or both, and grown in three-dimensional
culture. siMyc had no effect on Pvt1 RNA levels, and siPvt1
had no effect on Myc messenger RNA (mRNA) (Fig. 3a). Depletion of
either Myc or Pvt1 RNA resulted in approximately 60% reduction in
proliferation as measured by Ki67 staining (Fig. 3b). Surprisingly, simultaneous
knockdown of both Myc and Pvt1 failed to reduce proliferation
below individual knockdown of either gene (Fig. 3b), suggesting that
Myc and Pvt1 promote proliferation through the same pathway.

Analysis of Myc and Pvt1 expression showed that both gain(Myc) and
gain(Myc,Pvt1,Ccdc26,Gsdmc) mammary glands showed approximately
3.5-fold more Myc mRNA than wild type or gain(Pvt1,Ccdc26,Gsdmc)
(Fig. 3c). Pvt1 mRNA levels were approximately 1.5 times higher in
gain(Pvt1,Ccdc26,Gsdmc) than wild type, as expected, but were surprisingly
tenfold elevated in gain(Myc,Pvt1,Ccdc26,Gsdmc) and fourfold increased
in gain(Myc) mammary glands (Fig. 3c). Because MYC protein is a
transcriptional activator of PVT1 (refs 20, 21), this result suggested
higher levels of Myc protein and/or activity in gain(Myc,Pvt1,Ccdc26,
Gsdmc). Threefold higher Myc protein levels were observed in
gain(Myc,Pvt1,Ccdc26,Gsdmc) mammary gland compared with the other
three strains (Fig. 3d). Thus, a third copy of Myc + Pvt1 increased Pvt1
transcription and Myc protein in mice.

To extend these findings to high-8q24.21 copy cancers, we examined
human breast-cancer cell lines SK-BR-3 and MDA-MB-231, which harbour
high copy gains. siMYC and siPVT1 had no effect on PVT1 and
MYC RNA, respectively (Fig. 3e), but reduced proliferation to similar
extents, with combined siMYC + siPVT1 failing to reduce proliferation
further (Extended Data Fig. 6a, b). Furthermore, siPVT1 led to suppression
of MYC protein levels (Fig. 3f), verifying that MYC protein
levels are dependent upon PVT1 mRNA in high-copy 8q24-gain cancer
cells. Knockdown of individual microRNAs (miRNAs) encoded in the
human PVT1 introns had no detectable effect on proliferation (Extended
Data Fig. 6c, d).

The half-life of MYC protein is increased in SK-BR-3 and MDA-MB-231
cells compared with non-transformed breast epithelial cells.
To determine whether MYC protein stability is PVT1-RNA dependent,
SK-BR-3 and MDA-MB-231 cells were transfected with siPVT1 or control
siRNA and exposed to cycloheximide. PVT1 depletion resulted in
more rapid loss of MYC protein than in control cells (Fig. 3g and Extended
Data Fig. 7). MYC protein degradation is promoted by phosphorylation
at threonine 58 (ref. 23), and mice expressing Myc(T58A) show enhanced
mammary gland density and mammary carcinoma. Although siPVT1-mediated
reduction of MYC protein levels was not accompanied by
changes in levels of FBW7 and AXIN1, which are key downstream components
of MYC degradation, T58 phosphorylation was increased fivefold
(Fig. 3h). Thus, MYC is protected from phosphorylation at the T58
residue and subsequent degradation in a PVT1-RNA-dependent manner,
raising the possibility of a MYC/PVT1 ribonucleoprotein complex.
Simultaneous RNA fluorescence in situ hybridization (FISH) to
detect PVT1 and immunofluorescence for MYC showed nuclear co-localization
in 85.1% of SK-BR-3 cells (Fig. 3i and Extended Data Fig. 8).
Furthermore, immunoprecipitation using anti-MYC followed by PCR
with reverse transcription (RT-PCR) identified PVT1 RNA in the immunoprecipitates
(Fig. 3j), suggesting that PVT1 and MYC physically interact
directly or indirectly. Whether specific PVT1 isoform(s), CCDC26
and GSDMC promote malignancy remains to be investigated. Together,
these results indicate that gain of the Myc/MYC gene alone fails to
increase MYC protein levels but that co-gain of the Pvt1/PVT1 gene
disrupts Myc/MYC instability, resulting in increased protein levels and
enhanced proliferation.

If PVT1 copy increase is critical for high MYC protein levels, then 
co-gain of PVT1 should be mandatory in MYC-driven cancer. Analysis

Figure 2 | Pre-cancerous phenotypes in mouse gain(Myc,Pvt1,Ccdc26,Gsdmc)
mammary glands. a, b, Fluorescence images and quantification of γ-H2AX
foci (a) and BrdU-incorporation (b) in mammary ducts of indicated genotype
(n = 3). c, Wholemount analysis of mammary glands (higher magnification,
bottom). Inset, schematic of mammary gland. Branch points were enumerated
at a 25-mm² area near the lymph node (n = 3). d, e, Rescue of aberrant

Figure 3 | Pvt1/PVT1 co-gained with Myc/MYC elevates Myc/MYC protein
levels. a, RT-qPCR measurement of Myc (left) and Pvt1 (right) RNA levels in
gain(Myc,Pvt1,Ccdc26,Gsdmc),MMTVneu/+ mammary tumour cells transfected
with indicated siRNAs (n = 3). b, Proportions of primary tumour cells positive for
Ki-67 after the indicated siRNA treatments (n = 3). c, RT-qPCR of Myc (left) and
Pvt1 (right) transcript levels (n = 3), and d, western blot analysis (top) and
quantification (bottom) of Myc protein in mammary tissue (n = 3). GAPDH,
glyceraldehyde 3-phosphate dehydrogenase. e–h, Analyses of human breast
cancer cell line SK-BR-3. e, RT-qPCR measurement of MYC (left) and PVT1
(right) transcripts in cells 48 h after transfection with the indicated siRNA(s)

(n = 3). f, g, Western blot analysis of the MYC protein in SK-BR-3 after siRNA
transfection (n = 3) (f), and siRNA transfection and cycloheximide (CHX)

proliferation (d), and enhanced lateral branching (e) in the gain(Myc,Pvt1,
Ccdc26,Gsdmc) mammary ducts by corresponding loss allele. M, Myc; P, Pvt1;
C, Ccdc26; G, Gsdmc (n = 3). Results are shown as mean ± s.e.m. (*P < 0.05,
**P < 0.01, ***P < 0.001, two-tailed Student’s t-test). Scale bar on

a, b, d, 10 μm; c, e, 1 mm, 5 mm; error bars, s.e.m.

treatment for times indicated (n = 3) (g). h, Western blot analysis of MYC(p-T58),
MYC(p-S62), MYC, FBW7 and AXIN1 protein levels in SK-BR-3 treated with
siRNAs (left). Ratios of p-T58/total MYC and p-S62/total MYC (right) (n = 3).
i, Immunofluorescence staining of MYC (green) and RNA FISH of PVT1
(magenta) showing nuclear co-localization of MYC and PVT1 (white). 4′,6-
Diamidino-2-phenylindole (DAPI) is shown in blue. The marked cell in the upper
panel is shown in the lower panels in single channels and MYC + PVT1 overlay.
j, RT-PCR using PVT1 and GAPDH specific primers of total SK-BR-3 RNA
(input), immunoprecipitated using MYC antibody (IP MYC) and IgG (IP IgG).
PVT1 −RT indicates samples not treated with reverse transcriptase. Results are
shown as mean ± s.e.m. (*P < 0.05, **P < 0.01, ***P < 0.001, two-tailed
Student’s t-test). Scale bar, 10 μm; error bars, s.e.m.

Figure 4 | PVT1 dependence in MYC-driven tumours. a, b, Proportion of all
tumours harbouring gain of MYC but not PVT1 (blue), PVT1 but not MYC
(orange) and MYC + PVT1 (green) in the Progenetix (left) and TCGA
databases (right) (a) and among different cancer types in the TCGA database
(b). c, Tissue microarray analysis showing nuclear expression of PVT1 (dark
purple) and MYC (dark brown) in primary human tumours. Lower panels
represent ×10 magnification of regions shown by arrow in the upper panel.
d, Cartoons showing that stabilized mutant β-catenin (β-cat) upregulates MYC
transcription through the recruitment of T-cell factor (TCF) in human
colorectal cancer line HCT116. e, Schematic of CRISPR-mediated excision
of PVT1 to obtain the ΔPVT1 allele. DNA sequence of a PCR amplicon
containing the junction sequence of the deletion product is shown. f, Images of
colonies formed by PVT1+ and ΔPVT1 HCT116 cells in soft agar assay (top).

of 30,681 tumours from the Progenetix copy-number database showed 
that 18.8% (5,836 tumours) showed increased 8q24 copy number and 
5,763 out of 5,836 of these tumours (98.7%) had increased copy number 
of both MYC and PVT1 genes. Similarly, analysis of 15,241 tumours 
from The Cancer Genome Atlas (TCGA) database showed that 18.02% 
(2,821 tumours) showed 8q24 copy-number increase and that 2,746 
out of 2,821 tumours (97.34%) showed co-gain of both MYC and PVT1
whereas fewer than 0.15% of tumours showed increased copy number 
of MYC but not PVT1 (Fig. 4a and Extended Data Fig. 9a). Sorting 
TCGA tumours by cancer type showed differences in the incidence of 
8q24 copy-number increase in individual cancer types ranging from 
MYC + PVT1 increase in almost half of ovarian and oesophageal carcinomas
to essentially no MYC and/or PVT1 increase among papillary
thyroid tumours (Fig. 4b). Gain of MYC and PVT1 was observed in
62% of 483 HER2+ breast cancer samples (Extended Data Fig. 9b).
Unlike in mouse, GSDMC orthologues are expressed in human mammary

The insets are ×3 magnification of the areas marked in each plate.
Quantification of the respective colonies (bottom, n = 3). g, Tumour volume
measurements from xenograft transplants of PVT1+ and ΔPVT1 HCT116
cells. Bioluminescent imaging at 3, 7, 10, 13 and 17 days after inoculation (left).
ΔPVT1 HCT116 inoculation at the left flank where the tumour failed to grow is
designated by white circle (dashed). Mean tumour volumes are quantified
(n = 6) (right). h, Western blot of MYC and GAPDH protein in three PVT1+
and ΔPVT1 HCT116 clones. Quantification of relative MYC protein levels in
PVT1+ and ΔPVT1 HCT116 cells is shown (n = 3). i, Predicted outlook for
an 8q24 cancer patient after inhibition of PVT1. Results are shown as
mean ± s.e.m. (*P < 0.05, **P < 0.01, ***P < 0.001, two-tailed Student’s
t-test). Scale bar, 500 μm (c, top two rows of panels), 50 μm (c, bottom two rows
of panels); error bars, s.e.m.


tissue (http://www.genecards.org/cgi-bin/carddisp.pl?gene=GSDMC). 
Although gain of MYC + PVT1 + CCDC26 + GSDMC prevailed in low
copy-number gain of 8q24 segments, co-gain of MYC + PVT1 (but not
CCDC26 or GSDMC) prevailed in high copy amplifications (Extended 
Data Fig. 9c, d). Co-gain of MYC and PVT1, therefore, dominated over 
gain of MYC alone across all cancer types showing 8q24 copy-number 
increases. 
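
The co-gain percentages quoted above can be rechecked from the tumour counts given in the text. The following is a minimal sketch in Python, not the authors' analysis code; the function and variable names are illustrative assumptions, and only the four counts come from the article.

```python
# Recompute the MYC + PVT1 co-gain fractions from the counts quoted in the text.
# Names (percent, cohorts, cogain_percent) are hypothetical, chosen for illustration.

def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# cohort -> (tumours with 8q24 copy-number increase, of which MYC + PVT1 co-gained)
cohorts = {
    "Progenetix": (5836, 5763),
    "TCGA": (2821, 2746),
}

cogain_percent = {
    name: percent(cogained, gained)
    for name, (gained, cogained) in cohorts.items()
}

for name, value in cogain_percent.items():
    print(f"{name}: {value:.2f}% of 8q24-gain tumours co-gain MYC and PVT1")
```

Rounded to the precision used in the text, both values agree with the reported 98.7% (Progenetix) and 97.34% (TCGA).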

To verify these findings, we examined tissue microarrays for PVT1 
RNA using in situ hybridization and MYC protein using immunohis- 
tochemistry on serial sections of human lung, colon, rectum, stomach, 
oesophagus, liver, kidney and breast tumours. Concurrent PVT1 RNA 
and MYC protein expressions were found in all eight cancers (Fig. 4c 
and Extended Data Fig. 9e, f), confirming that PVT1 RNA and the 
MYC protein are correlated in primary tumours. 

Finally, effects of PVT1 loss on MYC-driven tumours were assessed
using HCT116 human colorectal carcinoma cells, which harbour low
copy-number 8q24 gain and stable mutant β-catenin leading to MYC
overexpression (Fig. 4d). CRISPR-associated nuclease (Cas9) was used
to excise precisely all copies of the PVT1 gene (Fig. 4e and Extended
Data Fig. 10a, b) generating PVT1-null (ΔPVT1) HCT116 cell lines
(Extended Data Fig. 10c). ΔPVT1 lines demonstrated reduced proliferation
(Extended Data Fig. 10d) and impaired colony formation in
soft agar compared with PVT1+ HCT116 cells (232.2 ± 23.8 versus
2,022.3 ± 140.7 colonies; Fig. 4f). In xenograft studies, ΔPVT1 HCT116
cells either failed to form tumours (three out of six xenografts; Fig. 4g) or
showed markedly reduced volume (three out of six xenografts; Extended
Data Fig. 10e) compared with PVT1+ counterparts (Fig. 4g). Finally,
MYC protein was significantly reduced (49.1% ± 3.4) in ΔPVT1 HCT116
clones compared with PVT1+ HCT116 cells (Fig. 4h). PVT1 therefore
regulates MYC protein level, and bestows tumorigenic potential on a
MYC-driven non-breast cancer line.
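
To put the soft-agar result in proportion, the two mean colony counts quoted above can be turned into a fold change and a percentage reduction. This is a small illustrative sketch in Python; the variable names are assumptions, and the calculation uses only the reported means, ignoring the s.e.m. values.

```python
# Effect size of PVT1 deletion on soft-agar colony formation,
# computed from the mean colony counts quoted in the text.
# Variable names are hypothetical; uncertainty (s.e.m.) is not propagated here.

pvt1_present_colonies = 2022.3  # mean colonies, PVT1+ HCT116 cells
pvt1_deleted_colonies = 232.2   # mean colonies, PVT1-deleted HCT116 cells

fold_reduction = pvt1_present_colonies / pvt1_deleted_colonies
percent_reduction = 100.0 * (1.0 - pvt1_deleted_colonies / pvt1_present_colonies)

print(f"Fold reduction in colonies: {fold_reduction:.1f}")
print(f"Percentage reduction in colonies: {percent_reduction:.1f}%")
```

On these means, deleting PVT1 cuts colony number by a little under ninefold, a larger relative effect than the roughly halved MYC protein level reported for the same clones.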

Targeting MYC directly with therapeutic interventions has proved
challenging; thus regulation of high MYC protein levels by the PVT1
long non-coding RNA (lncRNA) bears considerable implications for therapeutic
treatment of MYC-driven cancers. Because MYC is an important
transcription factor and an essential protein, direct inhibition may
have severe effects on patients. Our findings indicate that PVT1 lncRNA
increases MYC protein levels in 8q24-gain cancers and that loss of
PVT1 RNA reduces MYC protein to more normal levels. PVT1, therefore,
may be a more accessible and less deleterious target than MYC
itself for curtailing MYC-driven cancers (Fig. 4i). Future studies on the
role of PVT1 on the MYC protein level in cancers without supernu- 
merary 8q24 and illuminating the molecular details of MYC/PVT1 
cooperation may lead to rational drug discovery specifically targeting 
the MYC/PVT1 axis in human cancers. 


METHODS SUMMARY 


Chromosome engineering on mouse AB2.2 embryonic stem cells was performed as
previously described (Extended Data Fig. 2). Mouse mammary glands were analysed
as described. Tissue microarray slides of human multiple-organ tumours (BC00119)
were obtained from US Biomax. For copy-number analysis of TCGA tumours, data
were derived from the Affymetrix Genome Wide Human SNP Array 6.0 platform
from the open-access directory of TCGA
(https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/).
CRISPR-mediated ΔPVT1 HCT116 cells were generated using piggyBac
co-transposition enrichment.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 14 July 2013; accepted 8 April 2014. 
Published online 22 June; corrected online 6 August 2014 (see full-text HTML 
version for details). 


1. Huppi, K., Pitt, J. J., Wahlberg, B. M. & Caplen, N. J. The 8q24 gene desert: an oasis
of non-coding transcriptional activity. Front. Genet. 3, 69 (2012).

2. Haverty, P. M., Hon, L. S., Kaminker, J. S., Chant, J. & Zhang, Z. High-resolution
analysis of copy number alterations and associated expression changes in ovarian
tumors. BMC Med. Genomics 2, 21 (2009).

3. Guan, Y. et al. Amplification of PVT1 contributes to the pathophysiology of ovarian
and breast cancer. Clin. Cancer Res. 13, 5745-5755 (2007).

4. van Duin, M. et al. High-resolution array comparative genomic hybridization of
chromosome 8q: evaluation of putative progression markers for
gastroesophageal junction adenocarcinomas. Cytogenet. Genome Res. 118,
130-137 (2007).

5. Borg, A., Baldetorp, B., Ferno, M., Olsson, H. & Sigurdsson, H. c-myc amplification is
an independent prognostic factor in postmenopausal breast cancer. Int. J. Cancer
51, 687-691 (1992).

6. Kim, Y. H. et al. Combined microarray analysis of small cell lung cancer reveals
altered apoptotic balance and distinct expression signatures of MYC family gene
amplification. Oncogene 25, 130-138 (2006).

7. Sato, K. et al. Clinical significance of alterations of chromosome 8 in high-grade,
advanced, nonmetastatic prostate carcinoma. J. Natl Cancer Inst. 91, 1574-1580
(1999).

86 | NATURE | VOL 512 | 7 AUGUST 2014 


8. Lapointe, J. et al. Genomic profiling reveals alternative genetic pathways of
prostate tumorigenesis. Cancer Res. 67, 8504-8510 (2007).

9. Douglas, E. J. et al. Array comparative genomic hybridization analysis of
colorectal cancer cell lines and primary carcinomas. Cancer Res. 64,
4817-4825 (2004).

10. Zitterbart, K. et al. Low-level copy number changes of MYC genes have a prognostic
impact in medulloblastoma. J. Neurooncol. 102, 25-33 (2011).

11. Yamada, T. et al. Frequent chromosome 8q gains in human small cell lung
carcinoma detected by arbitrarily primed-PCR genomic fingerprinting. Cancer
Genet. Cytogenet. 120, 11-17 (2000).

12. Le Beau, M. M., Bitts, S., Davis, E. M. & Kogan, S. C. Recurring chromosomal
abnormalities in leukemia in PML-RARA transgenic mice parallel human acute
promyelocytic leukemia. Blood 99, 2985-2991 (2002).

13. Chin, K. et al. Genomic and transcriptional aberrations linked to breast cancer
pathophysiologies. Cancer Cell 10, 529-541 (2006).

14. Jain, A. N. et al. Quantitative analysis of chromosomal CGH in human breast
tumors associates copy number abnormalities with p53 status and patient
survival. Proc. Natl Acad. Sci. USA 98, 7952-7957 (2001).

15. Ramirez-Solis, R., Liu, P. & Bradley, A. Chromosome engineering in mice. Nature
378, 720-724 (1995).

16. Al-Kuraya, K. et al. Prognostic relevance of gene amplifications and
coamplifications in breast cancer. Cancer Res. 64, 8534-8540 (2004).

17. Park, K., Kwak, K., Kim, J., Lim, S. & Han, S. c-myc amplification is associated
with HER2 amplification and closely linked with cell proliferation in
tissue microarray of nonselected breast cancers. Hum. Pathol. 36, 634-639
(2005).

18. Guy, C. T. et al. Expression of the neu protooncogene in the mammary epithelium
of transgenic mice induces metastatic disease. Proc. Natl Acad. Sci. USA 89,
10578-10582 (1992).

19. Saji, S. et al. Estrogen receptors α and β in the rodent mammary gland. Proc. Natl
Acad. Sci. USA 97, 337-342 (2000).

20. Carramusa, L. et al. The PVT-1 oncogene is a Myc protein target that is
overexpressed in transformed cells. J. Cell. Physiol. 213, 511-518 (2007).

21. Lin, M. et al. RNA-Seq of human neurons derived from iPS cells reveals candidate
long non-coding RNAs involved in neurogenesis and neuropsychiatric disorders.
PLoS ONE 6, e23356 (2011).

22. Zhang, X. et al. Mechanistic insight into Myc stabilization in breast cancer
involving aberrant Axin1 expression. Proc. Natl Acad. Sci. USA 109,
2790-2795 (2012).

23. Yeh, E. et al. A signalling pathway controlling c-Myc degradation that
impacts oncogenic transformation of human cells. Nature Cell Biol. 6, 308-318
(2004).

24. Wang, X. et al. Phosphorylation regulates c-Myc’s oncogenic activity in the
mammary gland. Cancer Res. 71, 925-936 (2011).

25. Morin, P. J. et al. Activation of β-catenin-Tcf signaling in colon cancer by mutations
in β-catenin or APC. Science 275, 1787-1790 (1997).

26. Delmore, J. E. et al. BET bromodomain inhibition as a therapeutic strategy to target
c-Myc. Cell 146, 904-917 (2011).

27. Darnell, J. E., Jr. Transcription factors as targets for cancer therapy. Nature Rev.
Cancer 2, 740-749 (2002).

28. Nair, S. K. & Burley, S. K. X-ray structures of Myc-Max and Mad-Max recognizing
DNA. Molecular bases of regulation by proto-oncogenic transcription factors. Cell
112, 193-205 (2003).

29. Bagchi, A. et al. CHD5 is a tumor suppressor at human 1p36. Cell 128, 459-475
(2007).

30. Schwertfeger, K. L. et al. A critical role for the inflammatory response in a mouse
model of preneoplastic progression. Cancer Res. 66, 5676-5685 (2006).


Acknowledgements We thank A. T. Vogel for writing statistical analysis scripts; 
Research Animal Resources, University of Minnesota, for maintaining the mouse 
colony; S. Horn and L. Oseth for embryonic stem cell blastocyst injection and FISH 
analysis respectively. This work was supported by Masonic Cancer Center Laboratory 
start-up funds (to A.B.), and by grants from the Masonic Scholar Award (to A.B.), the 
Karen Wyckoff Rein in Sarcoma Fund (to A.B.), Translational Workgroup Pilot Project 
Awards by the Institute of Prostate and Urologic Cancer, University of Minnesota 

(to A.B.) and an American Cancer Society Institutional Research Grant (award 
118198-IRG-58-001-52-IRG92, to A.B.). A.T. was supported by an Indo-US fellowship 
from the Indo-US Science and Technology Forum. 


Author Contributions Y.Y.T. and A.B. conceptualized the research programme and 
designed the experiments; Y.Y.T., B.S.M., H.K., A.T., R.A., P.R., B.R., K.G., T.C.B., J.E., Y.K.
and A.B. performed the experiments. Y.Y.T. and W.G. analysed the data; M.G.O. and
Y.Y.T. performed the histological analyses; K.L.S., D.A.L., Y.M., Y.K. and A.B. supervised
experiments and data analysis; A.B. and Y.M. wrote the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to
A.B. (bagch005@umn.edu).


©2014 Macmillan Publishers Limited. All rights reserved 




doi:10.1038/nature13602 


Putative cis-regulatory drivers in colorectal cancer

The cis-regulatory effects responsible for cancer development have
not been as extensively studied as the perturbations of the protein-coding
genome in tumorigenesis. To better characterize colorectal
cancer (CRC) development we conducted an RNA-sequencing experi- 
ment of 103 matched tumour and normal colon mucosa samples 
from Danish CRC patients, 90 of which were germline-genotyped. 
By investigating allele-specific expression (ASE) we show that the 
germline genotypes remain important determinants of allelic gene 
expression in tumours. Using the changes in ASE in matched pairs 
of samples we discover 71 genes with excess of somatic cis-regulatory 
effects in CRC, suggesting a cancer driver role. We correlate geno- 
types and gene expression to identify expression quantitative trait 
loci (eQTLs) and find 1,693 and 948 eQTLs in normal samples and 
tumours, respectively. We estimate that 36% of the tumour eQTLs 
are exclusive to CRC and show that this specificity is partially driven 
by increased expression of specific transcription factors and changes 
in methylation patterns. We show that tumour-specific eQTLs are 
more enriched for low CRC genome-wide association study (GWAS) 
P values than shared eQTLs, which suggests that some of the GWAS 
variants are tumour-specific regulatory variants. Importantly, 
tumour-specific eQTL genes also accumulate more somatic mutations when 
compared to the shared eQTL genes, raising the possibility that they 
constitute germline-derived cancer regulatory drivers. Collectively, 
the integration of the genome and the transcriptome reveals a 
substantial number of putative somatic and germline cis-regulatory cancer 
changes that may have a role in tumorigenesis. 

The non-coding genome has so far been overlooked in the search for 
drivers in cancer, except for some isolated examples. The genome 
contains regulatory non-coding germline variation affecting gene 
expression, namely eQTLs, which are major components in complex disease 
predisposition, and these have not been examined within the context 
of tumorigenesis. Epistasis between eQTLs and coding mutations in 
genes has a role in disease; therefore it is likely that interactions between 
somatic regulatory variants or eQTLs and coding variation are 
important in tumorigenesis. To examine this we performed an RNA 
sequencing analysis of 103 matched tumour and normal colon mucosa CRC 
samples (Supplementary Fig. 2) of the SYSCOL consortium 
(Supplementary Table 1). Ninety samples were genotyped for their germline genome 
and imputed to 1000 Genomes phase 1 release (Supplementary Methods). 
We also RNA-sequenced 20 reference tissues to create a more compre- 
hensive control transcriptome for CRC. A general overview and main 
findings are summarized in Supplementary Fig. 3. 

We examined transcriptome perturbations during tumorigenesis by 
identifying differentially expressed genes (DEGs) and find 1,676 DEGs 
(false discovery rate (FDR) = 5%, fold change ≥ 2) (Supplementary 
Table 2 and Supplementary Fig. 4). The functional terms enriched in 
DEGs are given in Supplementary Table 3, and pathways impacted, which 
include known CRC pathways, are given in Supplementary Table 4. There 
is no significant difference in differential expression between known 
CRC driver genes or The Cancer Genome Atlas (TCGA) pan-cancer 
genes, and expression-matched random genes (Supplementary Fig. 5). 
In addition, we find 213 differentially spliced genes (FDR = 5%) 
(Supplementary Table 5, Supplementary Figs 6 and 7). Among these are 
TGFBI (ref. 9), RASA1 (ref. 10), SLC26A3 (ref. 11) and SLC39A14 (ref. 12), 
which have been previously implicated in CRC. The enriched func- 
tional terms are given in Supplementary Table 6. 
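
The gene lists above are reported at a fixed false discovery rate (FDR = 5%). The text does not say which FDR procedure was used, so the following is only a minimal sketch that assumes the Benjamini-Hochberg step-up procedure; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per input P value: True if it is called significant.

    Assumption: Benjamini-Hochberg step-up control of the FDR. The source
    text only states the FDR level, not the procedure.
    """
    m = len(p_values)
    # Indices of the P values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P value is below its step-up threshold.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical per-gene P values, in arbitrary order.
calls = benjamini_hochberg([0.041, 0.001, 0.216, 0.008], fdr=0.05)
```

With these four illustrative values, only the second and fourth tests pass the 5% threshold.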

We tested the relationship of samples’ gene expression and searched 
for regions of the genome in which CRC modifies the correlation of pro- 
ximal gene expression. We observe that the tumour samples, normal 
samples, and the 20 reference tissue samples form three distinct clus- 
ters (Supplementary Fig. 8), and the variance of the CRC transcriptome 
is increased when compared to the normal samples (Supplementary 
Fig. 9). In all chromosomes there are blocks of proximal genes whose 
correlations are altered in CRC (Supplementary Figs 10 and 11, and Sup- 
plementary Table 7) pointing to coordinated regulation likely due to 
consistent epigenetic effects and locus control regions. The results above 
suggest a rewiring of the regulatory networks in CRC. 

To further dissect cis-regulatory effects in CRC we examined variation 
in regulation of gene expression. One powerful method for discovering 
cis-regulatory variability is ASE analysis, which was conducted as described 
previously. We find that approximately 10% of coding heterozygous 
sites exhibit significant ASE (FDR = 1%, P < 0.001). The proportion of 
sites with ASE is significantly higher in tumours compared to normal 
samples (Fig. 1a) and approximately 34% of ASE is CRC-specific. This 
excess is likely to be an indication of copy number alterations (CNAs) 
that have been described previously. Furthermore, in approximately 
10% of ASE sites significant in both matched samples, the direction 
of the ASE effect is reversed (Fig. 1b), also indicating genes found in 
potential CNA regions or genes where the cis-regulatory landscape has 
been altered. 
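
To make the notion of a site with significant ASE concrete, the sketch below tests whether the reference and alternative allele read counts at one heterozygous site depart from a 50:50 ratio. It assumes a plain exact binomial test; the published method is only cited here, not described, and may correct for mapping and other biases that this sketch ignores. The function names and read counts are hypothetical.

```python
from math import exp, lgamma, log


def binomial_pmf(k, n, p):
    """Binomial probability of k successes in n trials, computed in log space."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1.0 - p))


def ase_p_value(ref_reads, alt_reads, expected_ref_ratio=0.5):
    """Exact two-sided binomial P value for allelic imbalance at one site.

    Assumption: reads are independent draws and, without ASE, the reference
    allele is seen with probability expected_ref_ratio.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("site has no read coverage")
    probs = [binomial_pmf(k, n, expected_ref_ratio) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum every outcome that is at least as unlikely as the observed one.
    total = sum(q for q in probs if q <= observed * (1.0 + 1e-7))
    return min(1.0, total)


def has_ase(ref_reads, alt_reads, alpha=0.001):
    """Apply the P < 0.001 threshold quoted in the text to one site."""
    return ase_p_value(ref_reads, alt_reads) < alpha
```

A site covered by 40 reference and 5 alternative reads would be called under this test, whereas a 12:8 split would not.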

We propose that the somatic events in the regulatory regions of genes 
are reflected in the difference of allelic ratios between matched pairs at 
expressed heterozygous sites. We define ASE somatic events (Supplemen- 
tary Methods) and observe a significant correlation (Spearman’s rho = 
0.05, P= 8.8 X 10 ”) between ASE somatic event rates and coding 
somatic mutation rates of genes, indicating that somatic ASE events 
may be more likely to be selected in genes implicated in CRC (Fig. 2a 
and Supplementary Fig. 12). To determine a score of somatic dysregu- 
lation we compared the ASE somatic event rate of each gene to all other 
genes (Supplementary Methods). The score is defined as the enrichment 
of low P values (π1). We observe a bimodal distribution of π1 
values where most genes are similar to other genes but a small fraction 
of genes demonstrate high π1, representing genes with dysregulation 
(Supplementary Fig. 13 and Supplementary Table 8). Known CRC 
driver genes and TCGA pan-cancer genes have significantly higher 
π1 compared to other genes (Supplementary Fig. 14). Therefore, the π1 
score contains information about the driving capability of cis-regulatory 
somatic mutations in highly scoring genes. 

Figure 1 | Allele-specific expression. a, Boxplots showing the distribution of 
the proportion of heterozygous sites with significant (P < 0.001) ASE in normal 
and tumour samples. This is significantly higher in tumours, which is 
suggestive of CNA. b, Allele ratios of significant (P < 0.001) ASE sites in 
tumours versus matched normals. Approximately 34% of the ASE sites are 
tumour-specific (red points) and approximately 10% of the shared ASE sites 
(black and orange points) show reversal of the effect (orange points). This is 
indicative of CNA or cis-regulatory changes in these samples. c, Boxplots 
showing the distributions of pairwise distances between pairs of samples 
calculated from ASE ratios. Allelic expression is more similar between tumours 
and their matched controls (magenta) than other comparisons indicating that 
germline genotypes remain important determinants of gene expression even 
after tumorigenesis. All distributions are significantly different from each 
other (Mann-Whitney U-test, P < 2.2 × 10^-16). In the boxplots the black 
horizontal line represents the median, the boxes are delimited by versions of the 
first and the third quartile, and whiskers extend to 1.5 times the box length with 
points outside of these represented as circles. 
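
The π1 score used for the dysregulation analysis is the estimated fraction of tests that are not null, that is, the enrichment of low P values. The sketch below is a simplified illustration that assumes a single fixed tuning value λ (here `lam`, a hypothetical parameter name); published implementations typically smooth the estimate over a range of λ values, so this is not a reproduction of the analysis.

```python
def pi1_estimate(p_values, lam=0.5):
    """Estimate pi1, the proportion of non-null tests, from a list of P values.

    Assumption: P values above lam come almost entirely from null tests and
    are uniformly distributed, so their density estimates pi0 = 1 - pi1.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("no P values supplied")
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    n_above = sum(1 for p in p_values if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))
    return 1.0 - pi0


# Hypothetical example: 80 strongly significant tests mixed with 20 null-like ones.
example = [0.001] * 80 + [0.9] * 20
print(pi1_estimate(example))
```

A gene whose ASE somatic event rate looks like that of all other genes yields P values that are roughly uniform and a π1 near zero; a dysregulated gene yields an excess of low P values and a high π1.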

To define genes with an excess of somatic cis-regulatory events we 
used principles similar to those used for identifying excess of protein- 
coding somatic events (Supplementary Methods). We detected 71 genes 
(FDR = 5%) with significantly higher ASE somatic event rates, which 
we define as genes with allelic dysregulation (GADs) (Supplementary 
Methods, Fig. 2c and Supplementary Table 8), which also have high π1 
(Fig. 2b). These GADs are significantly enriched for TCGA pan-cancer 
drivers and known CRC driver genes (seven pan-cancer genes P = 
0.035, two CRC genes P = 0.039; Supplementary Methods), indicating 
that we are capturing known cancer genes. This suggests two scenarios— 
somatic coding mutations and regulatory mutations undergo epistatic 
selection, or genes become involved in tumorigenesis in the presence 
of either—and we believe that both are likely explanations. We tested 
whether GADs can be driven by overexpression of genes in tumours, 
and observe that GADs are not significantly more overexpressed than 
other genes (Mann-Whitney U-test, P = 0.1; Supplementary Fig. 15). 
We assessed whether systematic CNAs were responsible for GADs 
(Supplementary Methods). There is no significant clustering of GADs 
(Supplementary Fig. 16) and although 25% of the ASE somatic events in 
GADs overlap with CNAs, this is lower than non-GADs (30%) and 
is not significantly different between GADs and non-GADs (Sup- 
plementary Fig. 17). These two results together suggest that the con- 
tribution of CNAs to the identification of GADs is not significant. Our 
ASE methodology, while taking into account many of the known biases, 
remains imperfect. However, this does not have significant influence in 
our analysis, as the majority of biases are expected to be shared between 
normal samples and tumours. Therefore by using recurrent ASE differ- 
ences in matched tumour and normal samples at germline sites as a 
proxy to the changes in the somatic regulation of genes, we have deter- 
mined a set of genes with putative cis-regulatory driving mutations. 

To assess the maintenance of germline cis-regulatory effects between 
normal and cancer tissue we looked at patterns of ASE. We calculated ASE 
distance of tumour and normal samples to normal and tumour samples 
in the same individual and other individuals (Supplementary Fig. 18). 
This shows that a tumour sample is most similar to its matched normal 
colon mucosa sample (Fig. 1c), meaning that allelic expression of most 
genes is conserved after tumorigenesis. Thus the germline genotypes 
remain key determinants of allelic gene expression in tumours. 
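
The comparison of ASE distances can be illustrated with a small sketch. The distance measure is not specified in this text, so the code assumes the mean absolute difference in reference-allele ratio over the heterozygous sites covered in both samples; the sample names, site names and ratios are hypothetical.

```python
def ase_distance(ratios_a, ratios_b):
    """Mean absolute difference of allelic ratios over sites shared by two samples.

    Each argument maps a site identifier to the reference-allele ratio (0 to 1).
    Assumption: this simple metric stands in for the unspecified one in the paper.
    """
    shared_sites = ratios_a.keys() & ratios_b.keys()
    if not shared_sites:
        raise ValueError("the two samples share no sites")
    total = sum(abs(ratios_a[s] - ratios_b[s]) for s in shared_sites)
    return total / len(shared_sites)


# Hypothetical allelic ratios for one tumour, its matched normal and an unrelated normal.
tumour_1 = {"site1": 0.70, "site2": 0.30, "site3": 0.55}
normal_1 = {"site1": 0.65, "site2": 0.35, "site3": 0.50}
normal_2 = {"site1": 0.40, "site2": 0.60, "site4": 0.50}

matched = ase_distance(tumour_1, normal_1)
unmatched = ase_distance(tumour_1, normal_2)
```

In this toy example the matched pair is closer than the unmatched pair, which is the pattern the text reports for real samples.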

One of the open questions in tumorigenesis is whether non-coding 
germline variants contribute as driving factors. We propose that if 
such variants existed they would affect gene expression specifically in 
tumours. To address this question we conducted a cis-eQTL analysis 
(Supplementary Methods). We find 1,693 and 948 eQTLs (permutation 
P < 0.01, FDR 9.4% and 16.1%) in normal samples and tumours, 
respectively (Supplementary Tables 9 and 10). At this threshold, 61% of the 
tumour eQTLs and 78% of normal eQTLs appear to be tissue specific, 
with 368 shared eQTLs (Fig. 3c). Using a more sensitive approach we 
estimate that 64% of tumour eQTLs and 62% of the normal eQTLs are 
shared (Fig. 3d). We find stronger eQTL effects and more sharing closer 
to the transcription start site (TSS) (Fig. 3a). The effect sizes of eQTLs 
show that the direction of shared eQTL effects is conserved after 
tumorigenesis (Fig. 3b), thus we expect the tumour-specific eQTLs to be similarly 
driven by germline genotypes. A strict set of tumour-specific eQTLs was 
defined (Supplementary Methods) and we assessed the differential 
expression between shared eQTLs and tumour-specific eQTLs. Although there 
is a very small (median 1.05-fold) but significant (Mann-Whitney U-test, 
P = 0.02; Supplementary Fig. 19) increase in expression of 
tumour-specific eQTL genes, it is not sufficient to explain the tumour-specific 
eQTLs. Using an interaction model with tumour stage we assessed whether 
the behaviour of the tumour-specific eQTLs changes as cancer progresses. 
Although no results survive multiple test correction, the top hit is THBS3 
(nominal P = 0.0007), which is a stimulator of tumour progression in 
osteosarcoma, making it a plausible candidate for a stage-regulated eQTL. 

Figure 2 | Genes that are significantly dysregulated in CRC. a, The 
correlation of ASE somatic event rate and non-synonymous (NS) coding 
somatic mutation rates of genes (the grey crosses are zeros that were 
transformed to the minimum non-zero value). b, Distribution of the π1 score 
for dysregulation. High π1 scores indicate genes that are dysregulated 
in CRC. Grey, distribution for all genes with a π1 > 0; red, distribution 

Figure 3 | cis-eQTLs. a, Tissue specificity and distance of eQTL to 
transcript start site (TSS). The shared eQTLs (red) are closer to the TSS than 
are the tissue-specific eQTLs (black) (Spearman’s rho = -0.28, 
P < 2.2 × 10^-16) and distance to TSS and significance are correlated 
(Spearman’s rho = -0.44, P < 2.2 × 10^-16). b, Effect sizes of eQTLs. The 
direction of the effect is conserved in the shared eQTLs meaning that the 
germline genotypes are the main drivers in tumour eQTLs. c, Mosaic plot of 
tissue specificity of eQTLs. Of eQTLs, 61% and 78% are specific to the 
tumours and to the normal samples, respectively. d, P-value distributions of 
significant SNP-exon pairs tested in the other tissue. The π1 statistic 
estimates the tissue sharing of eQTLs. e, Distributions of the proportion of 
large intestine samples in the COSMIC database with a confirmed 
protein-altering somatic mutation for the shared and tumour-specific eQTL 
genes. Tumour-specific eQTL genes accumulate significantly more somatic 
mutations (Mann-Whitney U-test, P = 7 × 10^-17) making some of these likely 
germline regulatory drivers. In the boxplots the black horizontal line 
represents the median, the boxes are delimited by versions of the first and 
the third quartile, and whiskers extend to 1.5 times the box length with 
points outside of these represented as circles. 

Figure 4 | Functional enrichments of eQTLs. a, The bar plot is ordered by the 
tumour-specific enrichment. The null (frequency and distance matched) is 
represented as the black horizontal line. The numbers above each bar are the 
-log10 P values of the enrichment, tumour-specific first followed by the shared. 
b, The ratio of the tumour-specific enrichment to the shared enrichment (in log 
scale) for significantly differentially expressed transcription factors is shown 
as grey bars. The red line depicts the differential expression fold change of the 
transcription factors. The first six transcription factors (IRX3, E2F4, NFIL3, 
TFAP2A, CUX1 and LEF1), where we have higher enrichments in the 
tumour-specific eQTLs, are also upregulated in CRC. 

To assess whether the eQTL variants were biologically functional, 
we used the Ensembl Regulatory Build to look for enrichment of eQTL 
variants in non-coding functional regions (Supplementary Methods). 
We find significant enrichments for many marks (Supplementary Fig. 20), 
highlighting the functional relevance of the variants discovered. To under- 
stand whether different types of functional effects were driving tumour- 
specific eQTLs and shared eQTLs we used the ChIP-seq (chromatin 
immunoprecipitation followed by sequencing) peaks from the colon 
cancer LoVo cell line (Fig. 4a). We observe binding sites of six 
transcription factors that have stronger enrichment in the tumour-specific eQTLs. 
All of these transcription factors are also significantly upregulated in CRC 
(Fig. 4b and Supplementary Table 11) and there is a significant positive 
correlation between the tumour-specific eQTL enrichment to shared 
eQTL enrichment ratio and fold change in the expression of the cor- 
responding transcription factors (r = 0.24, P = 0.01; Supplementary 
Fig. 21), indicating that differential expression of these factors is likely 
to be driving a proportion of the tumour-specific eQTLs. We also assessed 
methylation patterns and find that there is a significant increase in dif- 
ferential methylation in tumour-specific eQTL variant regions compared 
to shared eQTLs. (Mann-Whitney U-test, P = 2.5 x 10 ”, Supplemen- 
tary Methods and Supplementary Fig. 22). The changes in methylation 
indicate regulatory switches responsible for some of the tumour-specific 
eQTL effects. We also estimate that up to 38% of tumour-specific eQTLs 
are active in at least one other healthy tissue (Supplementary Methods 
and Supplementary Table 12). Finally, we tested for enrichment of low 
P values in CRC GWAS amongst the eQTLs (Supplementary Methods). 
The highest level of enrichment for low GWAS P values is seen in 
tumour-specific eQTLs (π1 of 11% versus 7% in shared eQTLs; Supplementary 
Fig. 23), signifying that a proportion of the CRC GWAS signals are cis- 
regulatory variants active only in the tumours. 

To test whether the tumour-specific eQTLs are likely to be drivers in 
CRC, we proposed that the cis-regulatory changes may have similar 
impact to somatic mutations in genes with the same or similar func- 
tion; therefore we would expect to see an increased somatic mutation 
rate in the tumour-specific eQTL genes. We compared the proportion 
of CRC samples in the COSMIC database that had a protein-altering 
somatic mutation between the 376 tumour-specific eQTL genes and 
the 368 shared eQTL genes, and find that tumour-specific eQTL genes 
accumulate significantly more somatic mutations (Mann-Whitney U-test, 
P = 7 × 10^-17; Fig. 3e and Supplementary Table 13). Moreover, 
there is a significant 2.5-fold enrichment (Fisher’s exact test, P = 0.02) 
for TCGA pan-cancer driver genes in tumour-specific eQTL genes 
(20 versus 8 in shared eQTL genes), making these likely non-coding 
cis-regulatory drivers in CRC. To avoid any potential detection bias we 
tested whether the tumour-specific eQTL genes also accumulate more 
mutational events by comparing inferred CNA in shared eQTL genes 
versus tumour-specific eQTLs using ASE estimates. There is no sig- 
nificant increase in inferred CNA of tumour-specific eQTLs (Fisher’s 
exact test, P= 0.5; Supplementary Fig. 24). Collectively these results 
reveal germline variants, whose functions are activated under the tumour 
state, some of which are likely to be selected in tumour progression. 
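
The comparison of somatic mutation burden between the two gene sets relies on a Mann-Whitney U-test. The sketch below shows the statistic on toy data and assumes a plain normal approximation with no tie or continuity correction, so its P values will differ from those of a full statistical package; the per-gene proportions are hypothetical and are not taken from COSMIC.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided P value by normal approximation).

    U counts the pairs (a, b), a from x and b from y, with a > b; ties count 0.5.
    Assumption: no tie or continuity correction is applied to the variance.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    # Two-sided tail probability of a standard normal deviate.
    p_value = erfc(abs(z) / sqrt(2.0))
    return u_x, p_value


# Hypothetical proportions of samples carrying a protein-altering mutation, per gene.
tumour_specific_genes = [0.10, 0.12, 0.15, 0.20]
shared_genes = [0.01, 0.02, 0.03, 0.05]
u_stat, p_val = mann_whitney_u(tumour_specific_genes, shared_genes)
```

Here every tumour-specific value exceeds every shared value, so U takes its maximum of 16 for two groups of four.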

Here we present the allelic transcriptome changes that occur in CRC 
tumorigenesis. We discover 71 GADs and 376 genes with tumour-specific 
germline cis-regulatory variants. Both categories demonstrate 
characteristics that support their role as putative cancer drivers. This gives us 
access to putative non-coding somatic and germline CRC drivers on an 
unprecedented scale. In addition, tumour-specific cis-eQTLs reveal a 
new category of variants that are likely to contribute to cancer besides 
predisposing alleles and somatic mutations. It is likely that some of the 
predisposing variants discovered via GWAS are in fact germline drivers. 
We demonstrate that integration of genome and transcriptome fol- 
lowed by ASE and eQTL analysis in normal-tumour matched samples 
can be used to identify important non-coding regulatory effects in cancer. 

Transcriptional enhancers are crucial regulators of gene expression 
and animal development and the characterization of their genomic 
organization, spatiotemporal activities and sequence properties is a 
key goal in modern biology. Here we characterize the in vivo activity 
of 7,705 Drosophila melanogaster enhancer candidates covering 13.5% 
of the non-coding non-repetitive genome throughout embryogen- 
esis. 3,557 (46%) candidates are active, suggesting a high density with 
50,000 to 100,000 developmental enhancers genome-wide. The vast 
majority of enhancers display specific spatial patterns that are highly 
dynamic during development. Most appear to regulate their neigh- 
bouring genes, suggesting that the cis-regulatory genome is organized 
locally into domains, which are supported by chromosomal domains, 
insulator binding and genome evolution. However, 12 to 21 per cent 
of enhancers appear to skip non-expressed neighbours and regulate 
a more distal gene. Finally, we computationally identify cis-regulatory 
motifs that are predictive and required for enhancer activity, as we 
validate experimentally. This work provides global insights into the 
organization of an animal regulatory genome and the make-up of 
enhancer sequences and confirms and generalizes principles from 
previous studies. All enhancer patterns are annotated manually 
with a controlled vocabulary and all results are available through a 
web interface (http://enhancers.starklab.org), including the raw images 
of all microscopy slides for manual inspection at arbitrary zoom levels. 

Animal development depends on differential gene expression governed 
by genomic regulatory elements called enhancers, which are being 
studied extensively. Many of the basic principles of developmental 
gene regulation have been elucidated in the fruitfly Drosophila 
melanogaster, and work over the past decades has characterized 
gene expression, transcription factor binding, chromatin features and 
enhancer activity in Drosophila at unprecedented levels. This 
and the ability to obtain many embryos from all developmental stages 
make Drosophila an ideal model in which to characterize spatiotemporal 
enhancer activities at a genomic scale and throughout embryogenesis. 

To systematically characterize developmental enhancers in the D. 
melanogaster genome, we made use of transgenic fly lines (Vienna Tiles 
(VT) library), publicly available from the Vienna Drosophila RNAi Center 
(VDRC). Each line contains a transcriptional reporter construct with a 
~2 kilobase (kb) genomic DNA fragment (enhancer candidate), minimal 
promoter and GAL4 reporter gene integrated into an identical position 
in the fly genome, thus allowing the direct comparison of the 
candidates’ activities (Fig. 1a, Extended Data Fig. 1a, b and Supplemen- 
tary Table 1). Together, these fragments cover about 14 million base pairs 
or 13.5% of the non-coding, non-repetitive genome, with little or no bias 
regarding the distance to transcription start sites (TSSs; Extended Data 
Fig. 1c) or the embryonic expression of neighbouring genes (Extended 
Data Fig. 1d). 

We developed a high-throughput pipeline to assess transcriptional 
enhancer activities in fly embryos by in situ hybridization against the 
GAL4 reporter transcript. For each transgenic line, we acquired whole-slide 
images, each with about 400 embryos covering all stages of embryogenesis, 
and manually annotated the enhancer activity patterns using a 
controlled vocabulary at six time intervals of embryogenesis (Extended 
Data Fig. 1e). The pipeline reported activities independent of fragment 
delineation and orientation and recovered 27 out of 28 known enhancers, 
whereas 13 out of 13 non-Drosophila controls were inactive (Extended 
Data Fig. 2a-c and Supplementary Information section 1). Results from 
re-testing 34 negative and 78 positive fragments using a different genomic 
site (on chromosome 2L instead of 3L) and reporter gene (lexA) were 
highly similar and the majority (82%) of enhancer activity patterns 
matched to the expression patterns of neighbouring genes, suggesting 
that we predominantly measured endogenous enhancer activities (Ex- 
tended Data Fig. 2d-f and Supplementary Information section 1). 
3,557 of all 7,705 tested candidate fragments (46%) were active in the 
embryo with diverse patterns that included gap and pair-rule patterns, 
all primary germ layers (Extended Data Figs 3a and 4a), and all major 
cell types and tissues (Fig. 1b and Extended Data Figs 3b and 4b). The 
fraction of active fragments increased about fivefold from ~7% in early 
embryos (stages 4-6) to ~35% for stages 15-16, consistent with the increase 
in organism complexity and the number of cell types (Fig. 1c). By con- 
trast, the number of expressed genes remains roughly constant during 
embryogenesis (~1.3-fold increase). Enhancer activities were much 
sparser than gene expression patterns both temporally and spatially: 
while 94% of all enhancers were only transiently active and only 0.8% 
were ubiquitous during the entire embryogenesis, this was true for 56.7% 
and 20.5% of the genes, respectively (Fig. 1d, e and Extended Data Fig. 5a-c). 
The temporal dynamics of enhancer activity were also apparent from 
changes of enhancer-associated chromatin features such as DNase I 
hypersensitivity (DHS), binding of co-activator CBP/p300, and presence 
of histone H3K27 acetylation mark assessed in entire embryos or 
in a tissue-specific manner (Fig. 1f-h, Extended Data Fig. 6 and 
Supplementary Information section 2). Together, this confirms and 
quantifies the transient and dynamic nature of enhancer function and 
suggests that development progresses through increasingly complex gene 
regulation by enhancers with temporally and spatially restricted activities. 
We next identified domains in the blastoderm embryo in which enhanc- 
ers appeared co-regulated (that is, were coordinately active or inactive). 
Automated image segmentation and reverse clustering revealed distinct 
regions corresponding to the presumptive anterior and posterior endo- 
derm, head and trunk mesoderm, procephalic neuroectoderm, and others, 
overall strongly resembling the established fate map of the blastoderm 
embryo (Fig. 1i and Extended Data Fig. 4c). This suggests that cells 
within these domains have a common developmental fate, presumably 
due to shared trans-regulatory environments. Indeed, during late stages, 
early mesodermal enhancers were preferentially active in mesoderm deriv- 
atives (somatic, visceral and cardiac muscles), whereas early endodermal 
enhancers were active in endoderm derivatives (midgut and Malpighian 
tubules) (Extended Data Fig. 4d). These and equivalent trends for 
other presumptive tissues of the early embryo (Extended Data Fig. 4e-g) 
demonstrate that enhancer activities are consistent with the progression 
of development along defined cell lineages, highlighting the gene 
regulatory basis of development. 

Figure 1 | Enhancers display highly diverse and dynamic activity patterns 
across Drosophila development. a, The VT library comprises transgenic 
flies with candidate fragments (blue) upstream of a transcriptional reporter 
(middle) in a constant genomic landing site (Extended Data Fig. 1). 
b, Proportion of enhancer activities in prominent tissues at stages 13-14 
(representative embryos; Extended Data Figs 3 and 4). VNC, ventral nerve cord. 
c, The number of active enhancers increases during embryogenesis, with 
some overlap between early and late enhancers (Venn diagram in c). d, 3,329 
(94%) embryonic enhancers are only transiently active. e, Temporal dynamics 
of gene expression (left, 5,134 genes) and enhancer activity (right, 3,557 
enhancers; black vertical lines indicate continuous expression or activity; 
Extended Data Fig. 5a). f-h, Heatmaps show the median enrichment of DNA 
accessibility (f), CBP/p300 binding (g) and H3K27 acetylation (ac) marks 
(h) on early (E), middle (M), late (L) and continuous (C) enhancers (rows) 
for experiments performed at different time points during Drosophila 
development (columns; red highlights coinciding time points; Extended Data 
Fig. 6). i, Co-regulated domains defined by reverse clustering of raw image 
data for 429 early enhancers resemble the embryo fate map (Extended Data 
Fig. 4c). ChIP, chromatin immunoprecipitation; NE, neuroectoderm. 

To analyse the locations of enhancers with respect to their putative 
target genes, we assigned enhancers to genes by manually matching enhancer 
activity and gene expression patterns (Figs 2a and 3a). For the 874 
enhancers with the strongest activity patterns, we considered 3,681 genes within 
five genes up- and downstream of each enhancer (including host genes 
for intronic enhancers), that is, 9,293 enhancer-gene pairs. For 4,224 of 
these pairs (45%; 1,690 genes), expression patterns were available, 
resulting in 482 enhancer-to-gene assignments (of the enhancers for which all 
neighbouring genes were characterized, 82% could be assigned; Extended 
Data Fig. 2f and Supplementary Table 4). The assignments were 
supported by the location of chromosomal domain boundaries, binding 
sites of insulator proteins and evolutionary chromosome breakpoints, 
all of which were depleted between enhancers and their assigned targets 
(Fig. 2b-d, Extended Data Fig. 7a-c and Supplementary Information 
section 3). Twenty-eight enhancers were assigned to and potentially 
regulate two genes, 23 of which were paralogues with very similar expression 
patterns (Supplementary Information section 4). During stages 4-6, 16 
genes were assigned to enhancers with overlapping or identical 
activities reminiscent of shadow enhancers. This is a considerable fraction 
(14%) among all 116 genes with multiple enhancers, in particular for 
developmental regulators (14 out of 16 genes are transcription factors; 
Supplementary Information section 5). 
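
The candidate set used for enhancer-to-gene assignment (up to five genes on either side of each enhancer) can be sketched as a simple neighbourhood lookup. This is only an illustration under stated assumptions: genes are reduced to a single TSS coordinate on one chromosome, host genes of intronic enhancers are not handled separately, and all names and coordinates are hypothetical. The manual matching of activity and expression patterns described in the text is not modelled.

```python
from bisect import bisect_left


def candidate_genes(genes_by_tss, enhancer_pos, flank=5):
    """Return names of up to `flank` genes on each side of an enhancer.

    genes_by_tss: list of (tss_position, gene_name) tuples sorted by position.
    Assumption: a gene is represented by its TSS coordinate only.
    """
    positions = [pos for pos, _ in genes_by_tss]
    # Index of the first gene whose TSS lies at or beyond the enhancer.
    i = bisect_left(positions, enhancer_pos)
    left = genes_by_tss[max(0, i - flank):i]
    right = genes_by_tss[i:i + flank]
    return [name for _, name in left + right]


# Hypothetical locus: twelve genes spaced 10 kb apart.
locus = [(10_000 * k, f"gene{k}") for k in range(1, 13)]
pairs = candidate_genes(locus, enhancer_pos=65_000)
```

Each returned gene forms one enhancer-gene pair to be compared against the enhancer's activity pattern; enhancers near a chromosome end simply receive fewer candidates.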

Along the linear genomic DNA sequence, the distances between the 
enhancers and the TSSs of their assigned target genes varied greatly: 
although many such pairs were close (21% were <4kb), the median 
distance was 10 kb, and 28% of all inferred regulatory interactions were 
distal (>20 kb), up to more than 100 kb (Fig. 2e). However, consider- 
ing the location of genes, the vast majority (88%) of all enhancers were 
located in the vicinity of their targets (Fig. 2f). Nevertheless, 12% of all 
enhancers were assigned across intervening genes and appeared to skip 
one (8%) or more (4%) genes to regulate a distal gene (Fig. 2f), as found 
for a Sex combs reduced (Scr) enhancer that lies beyond the fushi tarazu 
(ftz) gene. Interestingly, enhancers were located almost as frequently 
upstream (30%) as downstream (22%) of their target genes (for example, 
the SoxNeuro (SoxN) locus; Extended Data Fig. 8), suggesting that no 
particularly preferred relative enhancer location might exist. 
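
The distance summary in the preceding paragraphs (share of pairs closer than 4 kb, median distance, share beyond 20 kb) can be reproduced for any list of enhancer-to-TSS distances. The sketch below uses the two thresholds quoted in the text; the function name and the example distances are hypothetical and are not the study's data.

```python
from statistics import median


def summarize_distances(distances_kb, proximal_kb=4.0, distal_kb=20.0):
    """Summarize enhancer-to-TSS distances given in kilobases."""
    n = len(distances_kb)
    if n == 0:
        raise ValueError("no distances supplied")
    return {
        "median_kb": median(distances_kb),
        # Fraction of pairs closer than the proximal threshold.
        "fraction_proximal": sum(d < proximal_kb for d in distances_kb) / n,
        # Fraction of pairs farther than the distal threshold.
        "fraction_distal": sum(d > distal_kb for d in distances_kb) / n,
    }


# Hypothetical distances (kb) between seven enhancers and their assigned TSSs.
summary = summarize_distances([1, 3, 8, 10, 12, 25, 110])
```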

Thirty-six per cent of the enhancers were intragenic and appeared to 
predominantly (79%) regulate their host genes, as exemplified by Throm- 
bospondin (Tsp; Fig. 3a). However, 21% were assigned to a neighbouring 
gene instead (Fig. 3b), including an enhancer located inside the intron 
of bric a brac 1 (bab1) that appears to activate bab2 over a distance of 
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Figure 2 | The organization of the Drosophila cis-regulatory genome. 

a, Enhancer to target gene assignment based on enhancer activity and gene 
expression patterns. b-d, Chromosomal domain boundaries determined by 
Hi-C”' (b), breakpoints during genome evolution” (c) and insulator binding 
sites” (d) show relative depletions between enhancers and their assigned target 
genes (blue) and enrichments between enhancers and non-targets (red), 
whereas the opposite is true for the activator Trl” (binomial P values are shown; 
see Extended Data Fig. 7a—c for additional insulator proteins and details). 

e, Genomic distances between enhancers and their assigned target gene TSSs in 
kb (grey, frequencies; black, cumulative). f, Frequency of enhancers (purple) 
at different genomic positions relative to their target genes (blue; schematic 
locus). Eighty-eight per cent of all enhancers are in the genes’ genomic 
neighbourhoods within regulatory domains. CB, chromosomal breakpoints; 
DB, domain boundaries. 

93 kb (Fig. 3c). bab1 is not detectably expressed in the embryo during 
the corresponding developmental stage, which we found to be true 
more generally when intragenic enhancers regulated flanking genes rather 
than their host genes (Fig. 3d). Similarly, when intergenic enhancers were 
assigned to distal genes, the skipped genes were significantly less highly 
expressed than the target genes (Fig. 3d). Together these results sup- 
port a predominantly local organization of the Drosophila genome into 
regulatory domains reminiscent of the chromosomal domains inferred 
from chromatin interactions. 

The agreement of most enhancers’ activities with the expression pat- 
terns of neighbouring genes (Figs 2f, 3a and Extended Data Figs 2f, 8) 
confirms that enhancer activity is predominantly context independent. 
However, 18% of the enhancers could not be assigned to neighbouring 
genes (Extended Data Fig. 2f) and might be involved in more distal reg- 
ulation (for example, ref. 26). For 19%, the activities were similar but 
broader and might thus be modulated in the endogenous sequence con- 
texts in a more complex fashion (Supplementary Information section 1). 
Such context dependence is known for several loci in Drosophila (for 
example, the Hox locus) and mouse (for example, Fgf8 (ref. 28)), and 
enhancers in the bithorax complex indeed matched to gene expression 
patterns during early stages but appeared broader later (Extended Data 
Fig. 7d). 

Many different enhancers showed similar or identical activity patterns 
in various embryonic tissues. For example, 263 were active throughout 
the central nervous system (CNS), 59 in midgut and 32 in macrophages 
(Extended Data Fig. 3), thus probably providing sufficient statistical 
power to discern predictive sequence signatures. Indeed, the motif con- 
tent alone allowed the discrimination of enhancers from different func- 
tional classes using supervised machine learning in a cross-validated 
setting (Extended Data Fig. 9a, b and Supplementary Table 5). The 

Figure 3 | Intragenic enhancers in the Drosophila genome. a, Enhancers in 
the Tsp locus. Top, UCSC Genome Browser screenshot including tested 
fragments (purple, positive; grey, negative) and DNA accessibility. Bottom, 
embryos for all six time points of embryogenesis (left, in situ visualizing Tsp 
mRNA; arrows highlight small expression/activity domains). b, Twenty-one 
per cent of intragenic enhancers are assigned to a neighbouring gene. c, A distal 
bab2 enhancer (VT23828) in the intron of a neighbouring gene bab1. Top, 
UCSC Genome Browser screenshot including RNA-seq data for the 
corresponding stages. Bottom, embryo images depicting the bab1 and bab2 
expression during stages 13-14 (ref. 14) and VT23828’s activity in the 
proventriculus (middle). d, Non-regulated host and skipped genes are often not 
expressed. Box plots show gene expression (reads per kilobase per million 
(RPKM)) values as measured by RNA-seq for assigned target genes (blue) and 
non-regulated host genes (red, left) or skipped genes (red, right). Dark grey, 
unrelated neighbouring genes (control); light grey, all D. melanogaster genes. 
*e*D — 108 **P = 0.059, *P = 0.081, ~P > 0.1. Wilcoxon rank-sum test. 

fraction of classes for which predictions were successful increased with 
the number of enhancers per class (Supplementary Table 5) but appeared 
to be independent of pattern complexity. This suggests that our understanding 
of regulatory sequences will benefit from the ongoing functional 
characterization of enhancers. 

Different transcription factor motifs were strongly differentially dis- 
tributed between the enhancer classes (Fig. 4a and Extended Data Fig. 9c). 
For example, early embryonic enhancers were enriched in motifs of the 
transcription factor Zelda, an important activator of embryonic gene 
expression. Similarly, Twist (Twi) motifs were enriched in early mesodermal 
enhancers, Myocyte enhancing factor 2 (Mef2) motifs in late 

Figure 4 | Prediction and validation of cis-regulatory motif requirements 
for tissue-specific enhancer activities. a, Global cis-regulatory map of 
transcription factor motif enrichments in sequences of enhancers active in 
different tissues/cell types. Highlighted are Trl (GAGA) and CAC(N)NCAC- 
like motifs enriched in CNS and ubiquitous enhancers (1) and GATA-like 
motifs enriched in midgut enhancers (2; see Extended Data Fig. 9c for the entire 
map). D-V, dorso-ventral. Zld, Zelda. b, Experimental validation of predicted 
cis-regulatory motif requirements. Shown are the most discriminative motifs 
(left), representative enhancers active in the midgut (stages 13-15), broad CNS 
(stages 15-16) and A-P system (stages 4-6) and their motif mutant variants 
(middle), and a quantification of the staining (st) intensities (right; all 
P=<7xX10 Kolmogorov-Smirnov; see Extended Data Figs 9a, b and 10 
for details and eight additional enhancers). WT, wild type. 

somatic muscle enhancers, and Pannier (Pnr) and Tinman (Tin) motifs 
in dorsal vessel enhancers, consistent with the established roles of 
these transcription factors (Fig. 4a and Extended Data Fig. 9c). To test 
whether predicted motifs are required for enhancer activity, we selected 
three midgut, four CNS and four anterior—posterior (A-P) enhancers 
(11 enhancers total), for which the successful predictions depended on 
GATA-like, Trithorax (Trl, also known as GAGA)-like, and Tramtrack 
(Ttk)-like motifs, respectively (Fig. 4a and Extended Data Fig. 9b, c). 
For each, we created reporter flies with an enhancer variant in which we 
disrupted the respective motifs by point mutations and compared the 
activity of the mutant and wild-type enhancers, both manually and by 
computational image analysis (Fig. 4b and Extended Data Fig. 10). In 
10 out of 11 cases, the mutated enhancers were not active or had strongly 
reduced activity, validating the functional importance of the respective 
motifs. 
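
The class-specific motif enrichments described above can be illustrated with a simple over-representation test. The sketch below is only an assumption about how such an enrichment might be scored: it uses a one-sided hypergeometric test, whereas the statistics actually used are those described in ref. 29. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_total, n_with_motif, n_class, n_class_with_motif):
    """One-sided hypergeometric P value for motif enrichment in a class.

    Probability of seeing at least n_class_with_motif motif-containing
    enhancers when n_class enhancers are drawn without replacement from
    n_total enhancers, n_with_motif of which contain the motif.
    """
    upper = min(n_class, n_with_motif)
    favourable = sum(
        comb(n_with_motif, k) * comb(n_total - n_with_motif, n_class - k)
        for k in range(n_class_with_motif, upper + 1)
    )
    return favourable / comb(n_total, n_class)


# Hypothetical counts, not taken from the paper: 1,000 enhancers in total,
# 200 with the motif, and a 50-enhancer tissue class with 25 motif carriers.
p = hypergeom_enrichment_p(1000, 200, 50, 25)
print(f"enrichment P = {p:.3g}")
```

Exact integer arithmetic keeps the tail probability stable for small classes; for genome-scale counts a library implementation would be the more practical choice.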

Taken together, this work complements efforts that study chromatin 
properties or characterize enhancers at defined stages and in selected 
tissues. Our results confirm and generalize principles and models 
from smaller scale studies (reviewed in refs 1, 9, 12) and suggest a high 
density of developmental enhancers in the Drosophila genome with an 
estimated total of ~41,000 enhancers or four enhancers per expressed 
protein-coding gene on average during embryogenesis alone. In addition, 
considering enhancers that are exclusively active in larvae, 
pupae or the adult fly (Supplementary Information section 6), we 
estimate at least 50,000 to 100,000 developmental enhancers in 
the 170-megabase D. melanogaster genome. Even though the genome 
sequence properties (for example, repeat content and gene density) differ, 
this suggests that the 3-gigabase human genome could contain up to 
several million enhancers. In summary, the functional characterization 
of enhancers during the entire Drosophila embryogenesis adds a new 
level of functional annotation to the well-studied fly genome and eluci- 
dates global principles of cis-regulatory genome organization in animals, 
the importance of which for development, physiology, evolution and 
disease is becoming increasingly evident. 


METHODS SUMMARY 


We assessed enhancer activities of 7,705 genomic fragments of about 2 kb in embryos 
of transgenic GAL4-reporter (VT) fly strains obtained from the VDRC 
(http://stockcenter.vdrc.at/) by in situ hybridization. Embryos of each VT strain were manually 
annotated with a controlled vocabulary and positive strains were imaged. Motif analyses 
and support vector machine (SVM) predictions were performed as described 
in ref. 29. All fragment coordinates and annotations are in Supplementary Table 1 
and at http://enhancers.starklab.org/. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

Enhancer loops appear stable during development 
and are associated with paused polymerase 


Yad Ghavi-Helm, Felix A. Klein, Tibor Pakozdi, Lucia Ciglar, Daan Noordermeer, Wolfgang Huber & Eileen E. M. Furlong 


Developmental enhancers initiate transcription and are fundamental 
to our understanding of developmental networks, evolution and dis- 
ease. Despite their importance, the properties governing enhancer- 
promoter interactions and their dynamics during embryogenesis 
remain unclear. At the β-globin locus, enhancer-promoter interactions 
appear dynamic and cell-type specific, whereas at the HoxD 
locus they are stable and ubiquitous, being present in tissues where 
the target genes are not expressed. The extent to which preformed 
enhancer-promoter conformations exist at other, more typical, loci 
and how transcription is eventually triggered is unclear. Here we gen- 
erated a high-resolution map of enhancer three-dimensional contacts 
during Drosophila embryogenesis, covering two developmental stages 
and tissue contexts, at unprecedented resolution. Although local reg- 
ulatory interactions are common, long-range interactions are highly 
prevalent within the compact Drosophila genome. Each enhancer 
contacts multiple enhancers, and promoters with similar expression, 
suggesting a role in their co-regulation. Notably, most interactions 
appear unchanged between tissue context and across development, 
arising before gene activation, and are frequently associated with 
paused RNA polymerase. Our results indicate that the general to- 
pology governing enhancer contacts is conserved from flies to humans 
and suggest that transcription initiates from preformed enhancer- 
promoter loops through release of paused polymerase. 

Drosophila embryogenesis proceeds very rapidly, taking 18h from 
egg lay to completion. Underlying this dynamic developmental program 
are marked changes in transcription, which are in turn regulated by 
characterized changes in enhancer activity. However, the role and extent 
of dynamic enhancer looping during this process remains unknown. To 
address this, we performed 4C-seq (chromosome conformation capture 
sequencing) experiments anchored on 103 distal or promoter-proximal 
developmental enhancers (referred to as ‘viewpoints’; Extended Data 
Fig. 1a), and constructed absolute and differential interaction maps for 
each, varying two important parameters: (1) developmental time, using 
embryos at two different stages, early in development when cells are 
multipotent (3-4h after egg lay; stages 6-7), and mid-embryogenesis 
during cell-fate specification (6-8 h; stages 10-11); and (2) tissue con- 
text, comparing enhancer interactions in mesodermal cells versus whole 
embryo. To perform cell-type-specific 4C-seq in embryos, we established 
a modified version of BiTS-ChIP (batch isolation of tissue-specific chromatin 
for immunoprecipitation). Nuclei from covalently crosslinked 
transgenic embryos, expressing a nuclear-tagged protein only in mesodermal 
cells, were isolated by fluorescence-activated cell sorting (FACS; 
>98% purity) and used for 4C-seq on 92 enhancers at 6-8h and a 
subset of 14 enhancers at 3-4h. The same 92 enhancers, and 11 addi- 
tional regions, were also used as viewpoints in whole embryos at both 
time points (Extended Data Fig. 1b and Supplementary Table 1). The 
enhancers were selected based on dynamic changes in mesodermal transcription 
factor occupancy between these developmental stages and 
the expression of the closest gene. We thereby primed this study to detect 
dynamic three-dimensional (3D) interactions, focusing on developmental 
stages during which the embryo undergoes marked morphological and 
transcriptional changes. 

All 4C-seq experiments had the expected signal distribution, with 
high concordance between replicates (median Spearman correlation 
0.93). To assess data quality further, we examined ten known enhancer- 
promoter pairs (of the ap, Abd-B, E2f, pdm2, Con, eya, stumps, Mef2, sli 
and slp1 genes), and in all cases recovered the expected interactions 
(Fig. 1 and Extended Data Fig. 1c-l). For example, using an enhancer 
of the apterous (ap) gene, we detect the expected interaction with the 
ap promoter, 17 kilobases (kb) away (Fig. 1), illustrating the high quality 
and resolution of the data. 
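
Replicate concordance of the kind reported above is measured with the Spearman rank correlation. The following minimal sketch shows how that statistic can be computed for two per-fragment signal vectors; the function names and example values are hypothetical, and the authors' actual implementation is not described in this text.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-fragment read counts from two replicates.
replicate_1 = [120, 80, 45, 30, 12, 9, 3]
replicate_2 = [110, 95, 40, 33, 8, 10, 2]
print(round(spearman(replicate_1, replicate_2), 3))
```

Because only ranks are compared, the statistic is insensitive to differences in sequencing depth between replicates.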

In chromosome conformation capture assays, interaction frequencies 
decrease with genomic distance between regions. To adjust for this, 
we modelled the 4C signal decay as a function of distance using a 
monotonically decreasing smooth function (Extended Data Fig. 1b). After 
subtracting this trend, the residual interaction signal was converted to 
z-scores, and interacting regions were defined by merging neighbouring 
high-scoring fragments within 1 kb. Using this stringent approach, 
4,247 high-confidence interactions were identified across all viewpoints 
and conditions, representing 1,036 unique interacting regions 
(Supplementary Table 2). 
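
The three steps just described (fit a decreasing distance trend, convert residuals to z-scores, merge neighbouring high-scoring fragments) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the trend is a step-wise isotonic fit (pool-adjacent-violators) instead of a smooth function, counts are log-transformed with log1p, residuals are standardized with the mean and standard deviation, and the z-score cutoff of 3 is hypothetical. Only the 1-kb merging distance comes from the text.

```python
from math import log1p
from statistics import mean, pstdev


def fit_decreasing(values):
    """Least-squares non-increasing fit (pool-adjacent-violators)."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Pool blocks while a later block mean exceeds the earlier one.
        while (len(blocks) > 1 and
               blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def merge_regions(regions, max_gap=1000):
    """Merge (start, end) regions separated by at most max_gap bp."""
    merged = []
    for start, end in sorted(regions):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


def call_interactions(fragments, viewpoint, z_cutoff=3.0, max_gap=1000):
    """fragments: list of (start, end, read_count) on one chromosome."""
    # Order fragments by distance from the viewpoint before fitting the trend.
    by_dist = sorted(fragments, key=lambda f: abs((f[0] + f[1]) / 2 - viewpoint))
    signal = [log1p(count) for _, _, count in by_dist]
    trend = fit_decreasing(signal)
    resid = [s - t for s, t in zip(signal, trend)]
    mu, sd = mean(resid), pstdev(resid)
    if sd == 0:  # flat residuals: nothing stands out from the trend
        return []
    hits = [(start, end) for (start, end, _), r in zip(by_dist, resid)
            if (r - mu) / sd > z_cutoff]
    return merge_regions(hits, max_gap)


# Hypothetical toy data: a flat background with two adjacent high fragments.
toy = [(i * 500, i * 500 + 400, 10) for i in range(1, 29)]
toy += [(14500, 14900, 10000), (15000, 15400, 10000)]
print(call_interactions(toy, viewpoint=0))
```

Fitting the trend on distance-ordered fragments is what allows a modest signal far from the viewpoint to score higher than a larger signal close to it.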

Each enhancer (viewpoint) interacted with, on average, ten distinct 
genomic regions (Extended Data Fig. 2a), less than half (41%) of which 
were annotated enhancers or promoters. Distal enhancers had a higher 
than expected interaction frequency with other enhancers (Extended 
Data Fig. 2b, P = 2.4 X 10°). Similarly, promoter-proximal elements 

Figure 1 | A high-resolution view of enhancer interactions during 
Drosophila embryogenesis. a, 4C interaction map (viewpoint, red arrowhead) 
at the ap locus. The expected interaction with the promoter (blue arrowhead) of 
ap is observed. Known enhancers are indicated. b, Expression (in situ 
hybridization) of the ap gene (red) and expression driven by its interacting 
enhancer (GFP, green) at stage 11. MESO, mesoderm. 

had extensive interactions with distal active promoters, 98% of which 
are >10 kb away (Extended Data Fig. 2b, c, P = 6 X 10 *). Enhancer- 
promoter interactions, although not significantly enriched, involve active 
promoters, with high enrichment for H3K27ac and H3K4me3, and 
active enhancers, defined by H3K27ac, RNA Pol II and H3K79me3 
(ref. 6) (Extended Data Fig. 2d, e). In contrast, contacts at inactive pro- 
moters are significantly depleted (Extended Data Fig. 2b). These results 
are similar to recent findings in human cells and the mouse β-globin 
locus, indicating similarities in 3D regulatory principles from flies to 
humans. 

The extent of 3D connectivity is surprising given the relative sim- 
plicity of the Drosophila genome. On average, each promoter-proximal 
element interacted with four distal promoters and two annotated en- 
hancers, whereas each distal enhancer interacted with two promoters 
and three other enhancers. These numbers are probably underestimates, 
as 60% of interactions involved intragenic or intergenic fragments con- 
taining no annotated cis-regulatory elements. Despite this, the level of 
connectivity is similar to that recently observed in humans, where active 
promoters contacted on average 4.75 enhancers and 25% of enhancers 
interacted with two or more promoters. The multi-component contacts 
that we observe for Drosophila enhancers indicate topologically 
complex structures and suggest that, despite its non-coding genome 
being an order of magnitude smaller than that of humans, Drosophila may 
require a similar 3D spatial organization to ensure functionality. 

Insulators, and associated proteins, are thought to have a major role 
in shaping nuclear architecture by anchoring enhancer-promoter interactions 
or by acting as boundary elements between topologically associated 
domains (TADs). Occupancy data from 0 to 12 h Drosophila 
embryos revealed a 50% overlap of interacting regions with occupancy 
of one or more insulator proteins. Insulator-bound interactions 
are enriched in enhancer elements, suggesting that insulators may have 
a role in promoting enhancer-enhancer interactions (Extended Data 
Fig. 3a-d). In contrast to mammalian cells, we observed no association 
between insulator occupancy and the genomic distance spanned by 
chromatin loops, although there was a modest increase in average inter- 
action strength (Extended Data Fig. 3e, f). Conversely, 50% of inter- 
acting regions are not bound by any of the six Drosophila insulator 
proteins (Extended Data Fig. 3a, g), suggesting that these 3D contacts 
are formed in an insulator-independent manner, or are being facilitated 
by neighbouring interacting regions. 

If enhancer 3D contacts are involved in transcriptional regulation, 
then genes linked by interactions with a common enhancer should share 
spatio-temporal expression, as recently reported. For the four loci 
examined—pdm2 (Extended Data Fig. 4a, b), engrailed (en; Extended 
Data Fig. 4c, d), unc-5 (Extended Data Fig. 5c, d) and charybde (Fig. 2c, 
d, described below)—this is indeed the case. For example, the pdm2 
CE8012 enhancer interacts with both the pdm2 and nubbin (nub, also 
known as pdm1) promoters, located 2.5 and 47 kb away, respectively. 
Both genes, producing structurally related proteins, are co-expressed 
in the ectoderm, overlapping the activity of the pdm2 enhancer. 

Although there are examples of long-range interactions in Drosophila, 
often involving Polycomb response elements (PREs) and insulator 
elements, the vast majority of characterized enhancers are within 10 kb 
of their target gene, with few known to act over 50 kb (Fig. 2a and 
Supplementary Table 3). However, as investigators historically tested 
regions close to the gene of interest, characterized Drosophila enhan- 
cers are generally close to the gene they regulate. In contrast, although 4C 
cannot assess the full extent of short-range interactions (Extended Data 
Fig. 5a, b), it provides an unbiased systematic measurement of the dis- 
tance of enhancer interactions, far beyond 10 kb. 

The distance distribution of all significant interactions reveals extensive 
long-range interactions within the ~180-megabase (Mb) Drosophila 
genome; 73% span >50 kb, with the median interaction-viewpoint dis- 
tance being 110 kb (Fig. 2a, b). Two striking examples of long-range 
interactions are the unc-5 and charybde loci. The unc-5 promoter inter- 
acts with multiple regions, including a weak but significant interaction 

Figure 2 | Long-range interactions are widespread in the Drosophila 
genome. a, b, Distance distribution of all known Drosophila enhancers (red; 
literature based) and identified significant interactions (blue, n = 1,983) to their 
respective target gene or viewpoint, shown as a histogram (a) and box plot 
(b). c, 4C interaction map (viewpoint, red arrowhead) around the scyl and chrb 
loci. WE, whole embryo. Interaction with the scyl gene is highlighted (blue 
arrowhead). Significant 4C interactions, known enhancers and DNA FISH 
probes are indicated. d, Expression (in situ hybridization) of chrb (red) and scyl 
(green) at stage 11. e, DNA FISH images of representative nuclei, merging 
DAPI (blue), probe A (red) and probe B, C, A’ or D (green) channels. Scale bar, 
1 μm. Density plot indicates measured distances between probe A and probe A’ 
(grey), D (beige), B (blue) or C (green). 


with the promoter of slit (sli), at a distance of >500 kb (Extended Data 
Fig. 5c, d). These genes produce structurally unrelated proteins that are 
co-expressed in the heart, and are essential for heart formation. 

A promoter-proximal element near the charybde (chrb) promoter has 
a strong interaction with the promoter of the scylla (scyl) gene, almost 
250 kb away (Fig. 2c). Both genes are closely related in sequence and co- 
expressed throughout embryogenesis (Fig. 2d). These long-range interactions 
were confirmed by reciprocal 4C, using either the promoter of 
chrb or scyl, or an interacting putative enhancer as viewpoint (Extended 
Data Fig. 5e). We further verified this interaction using DNA fluor- 
escence in situ hybridization (FISH) in embryos (Fig. 2e). As a control, 
we assessed the distance between the chrb promoter (probe A) and an 
overlapping probe A’ or a region on another chromosome (probe D), 
to determine the distances between regions very close or far away, re- 
spectively. Comparing the distance between the chrb and scyl promoters 
(probes A and B, Fig. 2c) showed a high, statistically significant co- 
localization (Fig. 2e; 37% co-localization; P< 10 +8; Extended Data 
Fig. 5f), in contrast to the distance between the chrb promoter and a 
non-interacting region with equal genomic distance (probes A and C; 
5% co-localization). 

The reciprocal 4C revealed several intervening interactions that are 
consistently associated with loops to both the scyl and chrb promoter. 
We examined the activity of two of these in transgenic embryos. Both 
interacting regions can function as enhancers in vivo, recapitulating 
chrb expression in the visceral mesoderm (enhancer 1) and nervous 
system (enhancer 2) (Extended Data Fig. 5e, g). 

When considering a 1-Mb scale around this region, the 4C interaction 
signal drops to almost zero just after the promoters of both genes 
(Extended Data Fig. 6a). This ‘contained block’ of interactions is reminiscent 
of TADs, although the boundaries do not exactly match TADs 
defined at late stages of embryogenesis, which may reflect differences 
in the developmental stages used. However, the boundaries do overlap 
a block of conserved microsynteny between drosophilids spanning 
~50 million years of evolution (Extended Data Fig. 6a), suggesting a 
functional explanation underlying the maintained synteny. Expanding 
this analysis across all viewpoints, ~60% of interactions are located with- 
in the same TAD and the same microsyntenic domain as the viewpoint 
(Extended Data Fig. 6b, c). In the case of the chrb and scyl genes, this 
constraint may act to maintain a regulatory association between a 
large array of enhancers, facilitating their interaction with both genes’ 
promoters. 
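
The fraction of interactions that stay within the viewpoint's domain, as quantified above, amounts to an interval lookup. A minimal sketch follows, assuming domain boundaries are available as sorted coordinates on a single chromosome; the function names and coordinates are hypothetical and the authors' procedure may differ.

```python
from bisect import bisect_right


def domain_index(boundaries, pos):
    """Index of the domain containing pos, given sorted boundary positions."""
    return bisect_right(boundaries, pos)


def fraction_within_domain(boundaries, pairs):
    """Fraction of (viewpoint, interaction) pairs lying in the same domain."""
    pairs = list(pairs)
    same = sum(
        domain_index(boundaries, view) == domain_index(boundaries, partner)
        for view, partner in pairs
    )
    return same / len(pairs)


# Hypothetical boundary coordinates (bp) and viewpoint/interaction pairs.
boundaries = [100_000, 200_000, 300_000]
pairs = [
    (10_000, 90_000),    # same domain
    (110_000, 190_000),  # same domain
    (150_000, 250_000),  # crosses a boundary
    (290_000, 310_000),  # crosses a boundary
    (250_000, 260_000),  # same domain
]
print(fraction_within_domain(boundaries, pairs))
```

The same lookup applies to microsyntenic blocks by passing their boundaries instead of TAD boundaries.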

These examples, and the other 555 unique interactions >100 kb, 
provide strong evidence that long-range interactions are widely used 
within the Drosophila genome, potentially markedly increasing the reg- 
ulatory repertoire of each gene. 

As enhancer-promoter looping can trigger gene expression, it follows 
that enhancer contacts should reflect the dynamics of transcriptional 
changes during development and therefore be temporally associated with 
gene expression. To assess this, we directly compared looping interactions 
between the two different time points and tissue contexts. Given the non- 
discrete nature of chromatin contacts, we used the quantitative 4C-seq 
signal to identify differential interactions based on a Gamma-Poisson 
model and defined them as having >2-fold change and false discovery 
rate =10%. 
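
The two thresholds named above (fold change and false discovery rate) can be applied as in the sketch below. It assumes per-fragment P values have already been obtained from a count model such as the Gamma-Poisson model mentioned in the text, which is not implemented here. The Benjamini-Hochberg adjustment, the record layout and the reading of the FDR threshold as "at most 10%" are assumptions.

```python
from math import log2


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def differential_interactions(records, min_fold=2.0, max_fdr=0.10):
    """records: list of (fragment_id, signal_a, signal_b, p_value).

    Signals must be positive. Returns (fragment_id, log2 fold change,
    adjusted P) for fragments passing both thresholds.
    """
    padj = benjamini_hochberg([r[3] for r in records])
    calls = []
    for (frag, a, b, _), q in zip(records, padj):
        lfc = log2(b / a)
        if abs(lfc) > log2(min_fold) and q <= max_fdr:
            calls.append((frag, lfc, q))
    return calls


# Hypothetical normalized 4C signals at 3-4 h (a) and 6-8 h (b).
records = [
    ("f1", 10.0, 40.0, 0.001),
    ("f2", 10.0, 12.0, 0.001),
    ("f3", 40.0, 10.0, 0.9),
    ("f4", 30.0, 10.0, 0.02),
]
for frag, lfc, q in differential_interactions(records):
    print(frag, round(lfc, 2), round(q, 4))
```

Requiring both criteria guards against calling large but noisy changes as well as small but statistically significant ones.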

Despite the marked differences in development and enhancer activity 
between these conditions, we found surprisingly few changes in chro- 
matin interaction frequencies, with ~6% of interacting fragments show- 
ing significant changes between conditions (Extended Data Fig. 7; Fig. 3a 
and Extended Data Fig. 8a, red dots). Of these, 87 interactions were significantly 
reduced during mid-embryogenesis (6-8 h) compared to the 
early time point (3-4h), and 90 interactions significantly increased. 
Similarly, 105 interactions had a higher frequency in mesodermal cells, 
compared to the whole embryo, and 34 interactions were lower. 

For example, a promoter-proximal viewpoint in the vicinity of the 
Antp promoter identified many interactions, two of which are signifi- 
cantly decreased at 6-8h, although the expression of the Antp gene 
itself increases (Extended Data Fig. 8b). For one region, the reduction 
in 4C interaction at 6-8 h corresponds to a loss in a H3K4me3 peak 
from 3-4h to 6-8 h (asterisk), suggesting that this 3D contact is associated 
with the transient expression of an unannotated transcript. We 
examined the activity of the other interacting peak in transgenic embryos 
and showed that it acts as an enhancer, driving specific expression in 
the nervous system overlapping the Antp gene at 6-8 h (Extended Data 
Fig. 8c). Along with the two enhancers discovered at the chrb locus, this 
demonstrates the value of 3D interactions to identify new enhancer 
elements, even for well-characterized loci like Antp. 

A viewpoint in the vicinity of the Abd-B promoter interacted with a 
number of regions spanning the bithorax locus, three of which corre- 
spond to previously characterized Abd-B enhancers; iab-5 (ref. 26), 
iab-7 and iab-8 (refs 26, 27) (Fig. 3b, c). The iab-7 and iab-8 enhancers 
are active in early embryogenesis, and have much reduced or no activity 
at the later time point. Notably, although the loop to those two 
enhancers is strong at the early time point, it becomes significantly 
reduced later in development, when both enhancers’ activities are reduced. 
Conversely, the iab-5 enhancer contacts the promoter at a much higher 
frequency later in development, at the stage when the enhancer is most 
active. This locus therefore exhibits dynamic 3D promoter-enhancer 
contacts that reflect the transient activity of three developmental en- 
hancers. It is interesting to note that in all loci examined, the dynamic 
contacts of specific elements are neighboured by stable contacts, as seen 

Figure 3 | Specific loci display localized differential interactions. a, MA plot 
of interaction signal between whole embryo 6-8 h and whole embryo 3-4 h 
(significant differential interactions, red dots). WE, whole embryo. b, 4C 
interaction map at the Abd-B locus. Top to bottom: RNA-seq signal (reads per 
kilobase per million mapped reads (RPKM), black) in whole embryo at 2-4h 
and 6-8 h (ref. 9), 4C interaction map (viewpoint, red arrowhead) in whole 
embryo at 3-4 h (mauve) and 6-8 h (blue), and differential 4C signal (red) with 
significant differential 4C interactions (asterisk). Insets show the expression 
(in situ hybridization) of Abd-B (red) at stages 5 (2-4h) and 11 (6-8h). 
c, Expression (in situ hybridization) driven by iab5 (ref. 26), iab7 and iab8 
(ref. 27) enhancers at stages 5 and 10-11. Embryo images in panel c reproduced 
with permission from: ref. 27, Development; ref. 26, Nature Publishing Group. 

in the Antp and Abd-B loci. Dynamic changes, therefore, appear to operate 
in the context of larger, more-stable 3D landscapes. 

Ninety-four per cent of enhancer interactions showed no evidence of 
dynamic changes across time and tissue context, which is remarkable 
given the marked developmental transitions during these stages (Fig. 3a, 
Extended Data Fig. 8a and Supplementary Table 4). To investigate this 
further, we examined enhancer-promoter interactions of genes switch- 
ing their expression state between time points or tissue contexts. The ap 
gene, for example, is not expressed at 2-4h but is highly expressed 
during mid-embryogenesis (6-8 h) (Fig. 4a). Despite the absence of ex- 
pression, the interaction between the apME680 enhancer and the ap 
promoter is already present at 3-4h, several hours before the gene’s 
activation (Fig. 4a). To examine this more globally, we selected differ- 
entially expressed genes, going either from on-to-off or off-to-on (Ex- 
tended Data Fig. 9). Even for these dynamically expressed genes, there 
was no correlation with changes in their promoter-enhancer contacts 
(Fig. 4b). We observe similar ‘stable’ interactions between tissue con- 
texts. Genes predominantly expressed in the neuroectoderm at 6-8 h, 
for example, have interactions at the same locations in whole embryos 
and purified mesodermal nuclei at 6-8 h, despite the fact that they are 
not expressed in the mesoderm at this stage (Extended Data Fig. 8d-g). 

Pre-existing loops were recently observed in human and mouse cells, 
and suggested to prime a locus for transcriptional activation. However, 
why they are formed and how transcription is eventually triggered 
remains unclear. To investigate this, we focused on the subset of 
genes that have both off-to-on expression and no evidence for differ- 
ential interactions (20 genes; differentially expressed with stable loops 
(DS) genes; Supplementary Table 5 and Extended Data Fig. 9). Despite 
changes in their overall expression, DS genes have similar levels of RNA 
polymerase II (Pol II) promoter occupancy at both time points (Fig. 4c). 

Figure 4 | Interactions are stable across developmental time and associated 
with paused polymerase. a, 4C interaction map at the ap locus. Top to bottom: 
Pol II signal (reads per genomic content (RPGC)) in whole embryo at 2-4h 
(orange), GRO-seq signal in whole embryo at 2-2.5h (plus strand, red; minus 
strand, blue), RNA-seq signal (RPKM) in whole embryo at 2-4h and 6-8h 
(black), 4C interaction map (viewpoint, red arrowhead) in whole embryo at 
3-4 h (mauve) and 6-8h (blue), and differential 4C signal (red). WE, whole 
embryo. Significant 4C interactions and known enhancers are indicated. Insets 
show the expression (in situ hybridization) of ap at stages 6 (3-4h) and 11 
(6-8 h). b, Differentially expressed genes, going from off-to-on (n = 21) or 
on-to-off (n = 8), have no significant differences in the frequency of 4C 
interactions at their promoter (two-sided Wilcoxon test). c, Pol II signal 
(RPGC) is enriched at the promoter of differential genes with stable 

The presence of promoter-bound Pol II in the absence of full-length 
transcription is indicative of Pol II pausing. Using global run-on sequencing 
(GRO-seq) data to define a stringent set of paused genes, 
we observed that most (75%) DS genes are paused (15 of 20 DS genes; 
Fig. 4d and Extended Data Fig. 9b, d), and have a significantly higher 
pausing index (Fig. 4e). This percentage is significantly higher than 
expected by chance when sampling over all off-to-on genes (Fig. 4d), 
and is robust to using a strict (Fig. 4d) or more relaxed (Extended Data 
Fig. 9e) definition of Pol II pausing. This association is very evident 
when examining specific loci (Fig. 4a and Extended Data Fig. 10), show- 
ing Pol II occupancy, short abortive transcripts, and loop formation 
before the gene’s expression. Taken together, these results indicate that 
‘stable’ chromatin loops are associated with the presence of paused Pol 
II at the promoter. 
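
The comparison against chance described above (sampling over all off-to-on genes) is a resampling test. The sketch below assumes that gene sets the same size as the DS set are drawn without replacement from the off-to-on genes and that the paused count is compared with the observed 15 of 20. The background composition, the function name and the add-one correction are hypothetical choices, not details given in the text.

```python
import random


def resampling_p_value(is_paused, observed_paused, sample_size,
                       n_iter=10_000, seed=0):
    """Empirical one-sided P value for an excess of paused genes.

    is_paused: one boolean per background (off-to-on) gene.
    observed_paused: number of paused genes in the set of interest.
    sample_size: size of the set of interest.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(is_paused, sample_size)
        if sum(sample) >= observed_paused:
            hits += 1
    # Add-one correction so that the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# Hypothetical background: 300 off-to-on genes, 40% of them paused.
background = [True] * 120 + [False] * 180
p = resampling_p_value(background, observed_paused=15, sample_size=20)
print(f"empirical P = {p:.4f}")
```

With 10,000 iterations, the smallest P value this procedure can report is about 1 in 10,000.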

To understand how transcription is ultimately activated, we examined 
changes in DNase I hypersensitivity at the promoter of DS genes. 
DNase I hypersensitivity is significantly increased at interacting pro- 
moters at the stages when the gene is expressed (Fig. 4f), suggesting that 
the recruitment of additional transcription factor(s) later in develop- 
ment might act as the trigger for transcriptional activation. 

In summary, our data reveal extensive long-range interactions in an 
organism with a relatively compact genome, including pairs of co-regulated 
genes contacting common enhancers often at distances greater than 200 kb. 
Comparing enhancer contacts in different contexts revealed that chro- 
matin interactions are very similar across developmental time points 
and tissue contexts. Enhancers therefore do not appear to undergo long- 
range looping de novo at the time of gene expression, but are rather 
already in close proximity to the promoter they will regulate. Within 
this 3D topology, highly dynamic and transient contacts would not be 
visible when averaging over millions of nuclei. As transcription factor 
interactions (DS genes) at 2-4 h, even though the genes are not expressed. 

d, Expected distribution of paused genes (using top 50% of paused genes”) by 
sampling 10,000 times differential off-to-on genes (red dotted line, observed 
percentage of paused DS genes). P value is indicated. e, log2 pausing index in 
whole embryo at 2-2.5h (ref. 28) of DS genes (n = 19) is significantly different 
from all paused genes” (n = 7,734; two-sided Wilcoxon test). f, Box plot 
showing differential DNase I hypersensitivity (log2 fold change stage 10/stage 
5) at the promoter of DS genes (n = 18) and all Drosophila mRNA genes 
(n = 10,409, see Methods). P value from a two-sided Wilcoxon test. Boxes 
depict the interquartile range (IQR) with the median as a horizontal thick 
line. Upper and lower whiskers extend to 1.5 times the IQR, and points 
represent outliers. 


7 AUGUST 2014 | VOL 512 | NATURE | 99 


©2014 Macmillan Publishers Limited. All rights reserved 


LETTER 


binding is sufficient to force loop formation”, our results suggest a model 
where through transcription factor-enhancer occupancy, an enhancer 
loops towards the promoter and polymerase is recruited, but paused 
in the majority of cases. The subsequent recruitment of transcription 
factor(s) or additional enhancers at preformed 3D hubs most likely 
triggers activation by releasing Pol II pausing. Such preformed topol- 
ogies could thereby promote rapid activation of transcription**”’. At the 
same time, as paused promoters can exert enhancer-blocking activity”, 
the presence of paused polymerase within these 3D landscapes could 
safeguard against premature transcriptional activation, but yet keep 
the system poised for activation. 


METHODS SUMMARY 


Staged Drosophila embryos were collected at 3-4 h or 6-8 h after egg lay and fixed 
in 1.8% formaldehyde for 15 min at room temperature. Thirty million nuclei were used 
for each 4C template preparation, enough for on average ten viewpoints. Libraries were 
amplified from 320 ng of 4C template (primer sequence in Supplementary Table 6), 
and 100 multiplexed libraries were sequenced over on average five HiSeq2000 lanes, 
using 100-base-pair (bp) single-end reads. Two independent biological replicates 
were analysed for each condition. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
Visualizing the kinetic power stroke that drives 
proton-coupled zinc(II) transport 


Sayan Gupta, Jin Chai, Jie Cheng, Rhijuta D’Mello, Mark R. Chance & Dax Fu 


The proton gradient is a principal energy source for respiration-dependent 
active transport, but the structural mechanisms of proton-coupled transport 
processes are poorly understood. YiiP is a proton-coupled zinc 
transporter found in the cytoplasmic membrane of Escherichia coli. Its 
transport site receives protons from water molecules that gain access to 
its hydrophobic environment and transduces the energy of an inward 
proton gradient to drive Zn(II) efflux’”. This membrane protein is a well- 
characterized member’” of the family of cation diffusion facilitators that 
occurs at all phylogenetic levels*"°. Here we show, using X-ray-mediated 
hydroxyl radical labelling of YiiP and mass spectrometry, that Zn(II) 
binding triggers a highly localized, all-or-nothing change of water accessibility 
to the transport site and an adjacent hydrophobic gate. Millisecond 
time-resolved dynamics reveal a concerted and reciprocal pattern 
of accessibility changes along a transmembrane helix, suggesting a rigid- 
body helical re-orientation linked to Zn(II) binding that triggers the 
closing of the hydrophobic gate. The gated water access to the transport 
site enables a stationary proton gradient to facilitate the conversion of 
zinc-binding energy to the kinetic power stroke of a vectorial zinc transport. 
The kinetic details provide energetic insights into a proton-coupled 
active-transport reaction. 

Mammalian homologues of YiiP are responsible for zinc sequestration 
into secretory vesicles, thus playing important roles in neurotransmission” 
and hormone secretion’’. Zinc efflux catalysed by YiiP is coupled with 
proton influx in a 1:1 zinc-for-proton exchange stoichiometry’. When 
protons are scarce at higher pH, zinc transport comes to a halt despite a 
large zinc concentration gradient’. Thus, the zinc-for-proton coupling is 
obligatory. Biochemical studies and X-ray structures of YiiP showed that 
zinc transport is mediated by a tetrahedral Zn(II)-binding site in the centre 
of the transmembrane domain (TMD)'. This intramembranous zinc- 
transport site adopts coordination geometry satisfied by three Asp and 
one His residues, but lacks any additional polar or charged residues in 
the Zn(II)-binding pocket. The absence of available pH titratable residues in 
the second coordination sphere necessitates water access to fulfil proton 
donor or acceptor functions to enable the obligatory zinc-for-proton 
exchange. However, the crystal structure of zinc-bound YiiP (zinc-YiiP) 
shows that water access to the transport site is blocked by hydrophobic 
residues that divide the zinc translocation pathway into an extracellular and 
intracellular cavity'*. A protein conformational change is expected to open 
up a water portal within the hydrophobic seal. As water molecules gain 
access to the transport site in a transport reaction cycle, exposing YiiP to a 
millisecond synchrotron X-ray pulse could render residues in contact with 
waters susceptible to hydroxyl-radical-mediated oxidative modification, 
thereby permitting the monitoring of residue motions in terms of water 
accessibility change’*"*. Radiolytic hydroxyl radicals under such experi- 
mental conditions are generated rapidly and isotropically both in bulk 
and activated bound waters with side-chain oxidation completed within 
milliseconds'*"*. By comparison, the macroscopic timescale for zinc trans- 
port is of the order of 200-500 ms*”. Thus, time-resolved hydroxyl radical 
‘footprinting’ would have a sufficient time resolution to monitor proton 
translocation and associated protein conformational change. 


Purified YiiP in detergent micelles was exposed to a focused synchro- 
tron white beam, followed by a rapid mix with methionine-amide to 
quench secondary radical chain reactions (Extended Data Fig. 1a). The 
effective hydroxyl radical concentration was controlled in the micromolar 
range as indicated by an Alexa Fluor 488 dosimeter, and secondary radi- 
ation damage of YiiP was minimized by adjusting the X-ray irradiation to 
an optimal dose range'*"®. As a result, only negligible differences in size- 
exclusion high-performance liquid chromatography (HPLC) profiles were 
observed for the protein peaks before and after X-ray irradiation (Fig. 1a). The 
broad low-molecular-weight peak in zinc-YiiP (red trace) corresponded to the methio- 
nine-amide quencher added to the zinc-YiiP sample after irradiation. The sites of 
oxidative modification were characterized by +14, +16 and +32 dalton (Da) 
oxygen-based mass adducts'*’*"*, which were detected by bottom-up liquid 
chromatography—mass spectrometry (LC-MS) of proteolytic fragments of the 
irradiated YiiP (Fig. 1b), and confirmed by tandem mass spectrometry (MS/MS) 
assignments (Fig. 1c). The overall mass spectrometric sequence coverage was 
82% (Extended Data Fig. 2a), encompassing all residues located within the inter- 
cavity seal (Extended Data Fig. 2b). Increasing X-ray irradiation progressively 
increased the modified and reduced the unmodified populations, giving rise to a 
dose-response plot for each modified site (Fig. 1d and Extended Data Fig. 3). The 
initial phase of the dose-response plot followed a pseudo-first-order reaction, but 
occasional deviations from the exponential function were observed at increased 
irradiation times as a result of secondary modifications (Fig. 1d and Extended 
Data Fig. 3). Therefore, the slope of the initial phase was used to quantify the 
hydroxyl radical reactivity (Extended Data Table 1). 
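
The initial-slope quantification described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis pipeline: the exposure times, the rate of 10 s⁻¹ and the function name `initial_rate` are hypothetical, and the unmodified fraction is assumed to follow a pseudo-first-order decay, f(t) = exp(-kt), during the initial phase.

```python
import math


def initial_rate(times_s, unmodified_fraction, n_initial):
    """Estimate k from the first n_initial points of ln(f) = -k * t.

    The fit is a least-squares line constrained through the origin,
    because the unmodified fraction is 1 at zero exposure.
    """
    t = times_s[:n_initial]
    y = [math.log(f) for f in unmodified_fraction[:n_initial]]
    return -sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)


# Hypothetical, noise-free dose-response generated with k = 10 s^-1
true_k = 10.0
times = [0.0, 0.005, 0.010, 0.015, 0.020]  # exposure times in seconds (assumed)
fractions = [math.exp(-true_k * t) for t in times]

k = initial_rate(times, fractions, n_initial=4)
print(f"fitted rate = {k:.2f} s^-1")
```

Restricting the fit to the early points mirrors the reasoning in the text: later points can deviate from a single exponential once secondary modifications accumulate.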

The rate of side-chain labelling is governed by intrinsic reactivity of the 
amino acid and water accessibility to the side chain'*”’. The ratio of the 
measured reactivity rates for the same residue from zinc-YiiP and apo- 
YiiP gave a ratiometric account of the water accessibility change inde- 
pendent of the intrinsic side-chain reactivity or sequence context. Among 
all the detectable sites of modification, two sites exhibited conspicuously 
large differences in reactivity in the presence and absence of zinc (Fig. 2a). 
One instance where Zn(II) binding reduced reactivity more than 1000-fold 
was observed for three consecutive residues, V48, D49 and I50, within the 
peptide LVDI of TM2 (Extended Data Table 1). In the crystal structure of 
zinc-YiiP (Protein Data Bank accession number 3H90), D49 binds Zn(II) 
in the transport site and is one helical turn away from a structural water 
that is immobilized via a hydrogen bond to S53 (3.1 Å to Oγ) (Fig. 3a). 
Coordination of Zn(II) to the transport site may suppress productive 
radiolysis of this structural water, resulting in a negligible rate of VDI 
labelling in zinc-YiiP (Extended Data Table 1). In sharp contrast, the 
absence of a coordinated Zn(II) in apo-YiiP permitted an unusually fast 
radiolytic labelling at 163 s⁻¹ (Extended Data Table 1). Such a high level of 
reactivity has been observed for radiolytic labelling by structural water 
molecules in the hydrophobic core of a G-protein-coupled receptor”. 
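
The ratiometric argument at the start of this paragraph can be made concrete with a small sketch. It assumes, as a simplification not stated in the paper, that the observed labelling rate is the product of an intrinsic side-chain reactivity and a water-accessibility factor; all numbers and function names are hypothetical.

```python
def observed_rate(k_intrinsic, accessibility):
    """Assumed model: labelling rate = intrinsic reactivity x water accessibility."""
    return k_intrinsic * accessibility


def accessibility_ratio(k_zinc, k_apo):
    """Ratio of labelling rates for the same residue in zinc-YiiP and apo-YiiP."""
    return k_zinc / k_apo


# Two hypothetical residues with very different intrinsic reactivities
# that undergo the same fourfold loss of water accessibility on zinc binding.
reactive_residue = accessibility_ratio(observed_rate(100.0, 0.2),
                                       observed_rate(100.0, 0.8))
unreactive_residue = accessibility_ratio(observed_rate(2.0, 0.2),
                                         observed_rate(2.0, 0.8))

print(reactive_residue, unreactive_residue)
```

Under this model the intrinsic term cancels in the ratio, which is why the same accessibility change gives the same ratio for a highly reactive and a weakly reactive side chain.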

A second very significant change was observed for a +14 Da modification 
of L152 in the peptide ADMLHY of TM5 (Fig. 2a). In apo-YiiP, the 
rate of +14 Da modification for L152 was 8.5 s⁻¹ (Extended Data Table 1) 
while Zn(II) binding reduced the reactivity of L152 more than 100-fold, to a 
negligible level, illustrating that a Zn(II)-binding-induced conformational 
Figure 1 | Radiolytic labelling and mass spectrometric analysis. a, Size- 
exclusion HPLC chromatograms of apo-YiiP before irradiation and zinc-YiiP after 
irradiation. b, Examples for quantification of radiolytic labelling by LC-MS; 
extracted ion-count chromatograms of singly protonated, unmodified (749.33 m/z, 
black), carbonylated (+14 Da mass shift, 763.33 m/z, red) and hydroxylated 
(+16 Da mass shift, 765.33 m/z, blue) peptide ADMLHY. c, Examples for 
identification of modified residues by MS/MS of the carbonylated and hydroxylated 
ADMLHY with peak assignments (red arrow and blue line) confirming L152 and 
M151 modification, respectively. d, Dose-responses showing reciprocal solvent 
accessibility changes at L152 and M151 sites in apo-YiiP (red) and zinc-YiiP 
(black). Solid lines, least-squares fits of the means of dose-dependent data; error bar, 
s.e.m. from four to six independent measurements. 


change removed water access to the side chain of L152. This peptide 
contained another labelled residue, M151, whose +16 Da modified pro- 
ducts could be isolated from those of L152 on the basis of the difference in 
the mass to charge ratio (m/z) (Fig. 1b, c). The same conformational change 
that reduced reactivity of L152 yielded a 2.5-fold increase in reactivity for 
the neighbouring M151 (Figs 1d and 2a). In the zinc-YiiP structure, L152 is 
fully buried and oriented towards the intracellular cavity as a part of the 
inter-cavity seal, consistent with the lack of radiolytic labelling (Fig. 3a). 
L152 is located at the interface between a TM3-TM6 helix pair and a 
compact TM1-TM2-TM4-TM5 four-helix bundle (Fig. 3b). These two 
subdomains cross over to form two cavities located on either side of the 
membrane as indicated by arrows in Fig. 3b. The inter-domain packing 
wedges TM5 (coloured in red) at one corner of the four-helix bundle 
into the TM3-TM6 interface with L152 situated at the centre of the 
TM5-TM3-TM6 triple-helix joint (Fig. 3b). L152 interacts with I90 
from TM3 and A194 from TM6 to form a tight knob-into-hole packing. 
The I90 equivalents have been identified as metal determinant residues in 
plant and yeast cation diffusion facilitator (CDF) homologues’*”’. One 
helical turn down towards the intracellular cavity is another layer of 
residue triad: A83 from TM3, A149 from TM5 and M197 from TM6, 
which define the innermost section of the intracellular cavity (Fig. 3b). Of 
note, the conformational changes of M197 echo those of L152, with a zinc- 
dependent reduction of solvent accessibility (Fig. 2a) except that the 
accessibility of M197 is not reduced to the background level in the zinc- 
bound state (Extended Data Fig. 3). This cluster of six residues forms a 
highly conserved TM5-TM3-TM6 packing core (Extended Data Fig. 4), 
with L152 serving as a principal hydrophobic barrier between the two 
Figure 2 | Quantification of water accessibility changes. a, Water accessibility 
changes in response to Zn(II) binding measured by the ratio of labelling rates for 
residues with an increase (blue), decrease (red) or no change (grey) in water 
accessibility after a rapid Zn(II) exposure. The labelling rate for each site as indicated 
is summarized in Extended Data Table 1, and the error bar represents the standard 
error from four to six independent measurements. b, Residues with a partial water 
accessibility change in response to zinc binding. TM5 is coloured in red. Z1 and 
Z2 (magenta spheres) represent bound zinc ions. Arrow indicates a putative zinc- 
transport pathway from the cytoplasm through the L152 gate to the transport site. 


cavities (Fig. 3c, d). The structural and functional importance of L152 was 
examined by a series of point mutations (Extended Data Fig. 5). All L152 
mutants expressed well. However, substitutions of L152 with smaller (G, 
A), bulky aromatic (F) and charged residues (D, R) resulted in complete 
denaturation after the mutant proteins were solubilized by DDM, whereas 
conserved L152 substitutions with I and M residues were partly tolerated 




Figure 3 | L152 controls the opening of an inter-cavity water portal. a, A 
structural water molecule (W, red sphere) near the transport site occupied by a 
tetrahedrally coordinated Zn(II) (Z1, magenta sphere), viewed from the periplasm. 
Relevant residues are drawn in sticks and labelled accordingly. TM5 is coloured in 
red as indicated. b, Intracellular and extracellular cavity as outlined by dashed lines. 
c, L152 gate viewed from the extracellular cavity along the arrow as indicated in 
b. The side chains of L152, I90 and the coordination residues in the transport site 
(sticks) are excluded from the protein surface drawing. M197 is shown as a yellow 
patch at the cytoplasmic entrance to the inter-cavity portal. d, L152 gate viewed 
from the intracellular cavity along the arrow as indicated in b. M151 and M197 are 
visible as yellow patches on the protein surface. 




[Figure 4 plots. a, extracted ion chromatograms; x-axis, retention time (min). b, time-course panels for M159/M160, V48/D49/I50, M197, M262 and W225; x-axis, reaction time (s).] 


Figure 4 | Kinetics of water accessibility changes. a, An example of extracted 
ion chromatograms from the unmodified and modified M197 in peptide 197- 
208. The data were smoothed by a low-pass filter and normalized to the peak 
height of respective unmodified species. The arrow indicates a progressive 
decrease of the modified peaks as a function of the reaction time. b, Time 


(Extended Data Fig. 5). The side-chain-dependent effects of L152 muta- 
tions on protein stability are consistent with a critical structural role for 
L152 in the highly conserved TM5-TM3-TM6 packing core. 

Among all the detectable sites, only the transport site and its neighbour- 
ing L152 gate exhibited all-or-nothing water accessibility changes (Figs 1d 
and 2a and Extended Data Fig. 3), suggesting a tight control of water 
leakage across the membrane. Within the TMD, oxidative modifications 
were observed at four reactive Met residues outside the transport site and 
the L152 gate. As noted above, M197 at the intracellular entrance to the 
L152 gate (Fig. 2b) showed a 70% reduction in water accessibility upon 
Zn(II) binding (Fig. 2a) whereas M151, M159 and M160 at the amino (N) 
and carboxy (C) termini of TM5 (Fig. 2b) showed a 50-130% increase in 
water accessibility (Fig. 2a and Extended Data Table 1). These last three Met 
residues reside on a helical face of TM5 with increased water accessibility 
upon Zn(II) binding (Fig. 2b). By contrast, residues with a reduction of 
water accessibility upon Zn(II) binding are either located on the opposite 
TM5 face (for example, L152) or packed against the opposite TM5 face 
(V48 and M197) (Fig. 2b). The tetrahedral transport site (H153, D157, D45 
and D49) is also located on the same TM5 face with a zinc-dependent 
loss of water accessibility. The reciprocal change in water accessibility on 
two opposite TM5 faces is consistent with re-orientation of TM5 in response 
to Zn(II) binding. Furthermore, solvent-accessible residues in apo- 
YiiP were found to line a putative transmembrane zinc pathway, starting 
from M197 in the intracellular cavity, through L152 within the inter-cavity 
seal and arriving at H153, V48 and D49 in the extracellular cavity (Fig. 2b). 
This finding of a well-defined channel from the transport site in apo- 
YiiP to the intracellular cavity is in agreement with an inward-facing 




courses of water accessibility change for indicated residues. The solid line 
represents a single exponential fit of the time course of the unmodified fraction 
with a fitted rate constant (k) presented as mean ± s.e.m. from six independent 
measurements. 


conformation revealed by an electron crystallographic structure of an 
apo-YiiP homologue”. 

To understand the structural dynamics of the Zn-dependent closing of 
the inter-cavity portal, we monitored the time course of radiolytic modification 
upon rapid mixing of apo-YiiP and 0.2 mM ZnCl2 (Extended 
Data Fig. 1b). Only highly reactive residues could be detected with a 
sufficient signal-to-noise ratio for quantitative kinetic analysis (Fig. 4a). 
In the TMD, time-resolved measurements were performed on four Met 
residues (M151, M159/M160 and M197) and the V48/D49/I50 peptide. 
After mixing of apo-YiiP and Zn(II), the exponential increases in unmodified 
V48/D49/I50 and M197 residues, indicative of the closing of the inter- 
cavity portal, mirrored the exponential falls in unmodified M151 and 
M159/M160 residues (Fig. 4b). The rates of reciprocal water accessibility 
changes for these four positions on opposite faces of TM5 were identical 
within experimental errors, suggesting that TM5 underwent a rigid-body 
re-orientation upon zinc binding. The rigid-body motion of TM5 pre- 
dicted a similar rate of L152 motion. Averaging the rates of four detectable 
sites gave an overall rate of TM5 motion at 1.8 ± 0.7 s⁻¹, approximating 
the macroscopic transport rate (2-5 s⁻¹) determined by stopped-flow 
Zn(II) flux measurements*”. Thus, zinc access to the transport site and 
the ensuing TM5 motion linked to the closing of the L152 gate occur on 
the same time scale. As an internal control, time-resolved measurement 
showed no changes in labelling to W225 on the cytoplasmic domain 
(CTD) surface (Fig. 4b). However, another surface residue, M262, exhibited 
a rapid change at 14.3 s⁻¹ (Fig. 4b). The marked kinetic difference 
suggested that the observed water accessibility change to M262 preceded 
TM5 motion, but its functional relevance is unclear. 
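
To compare the rates quoted in this paragraph on a common scale, the short sketch below converts first-order rate constants into time constants (1/k) and half-times (ln 2/k). The rate values are those given in the text; the helper names are illustrative only, and treating each process as a single first-order step is an assumption made for the comparison.

```python
import math


def time_constant_ms(k_per_s):
    """Time constant (1/k) of a first-order process, in milliseconds."""
    return 1000.0 / k_per_s


def half_time_ms(k_per_s):
    """Half-time (ln 2 / k) of a first-order process, in milliseconds."""
    return 1000.0 * math.log(2) / k_per_s


rates_per_s = {
    "TM5 motion (mean of four sites)": 1.8,
    "macroscopic transport, slow bound": 2.0,
    "macroscopic transport, fast bound": 5.0,
    "M262 surface residue": 14.3,
}

for label, k in rates_per_s.items():
    print(f"{label}: tau = {time_constant_ms(k):.0f} ms, "
          f"t1/2 = {half_time_ms(k):.0f} ms")
```

A transport rate of 2-5 s⁻¹ corresponds to time constants of 500-200 ms, matching the macroscopic timescale quoted earlier in the paper, whereas the M262 change is several-fold faster than the TM5 motion.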
Figure 5 | Schematic representation of zinc-for-proton exchange. The 
representation is based on two existing structural models with the L152 gate 
open or closed as indicated. The protein conformational change alternates the 


membrane-facing on-off mode of zinc coordination and protonation- 
deprotonation of the transport site in a coordinated fashion. 
The time-resolved data suggested that Zn(II) binding triggers a concerted 
rigid-body motion of TM5 that swings L152 into place to plug the inter- 
cavity seal (Fig. 5). Since the four coordination residues of the transport site 
are projected from TM2 and TM5 (Fig. 3a), a TM5 motion is expected to 
alter the TM2-TM5 inter-helix orientation that determines the coordination 
geometry of the transport site’. Thus, a rigid-body TM5 motion would 
simultaneously affect the mode of Zn(II) coordination and the gating of the 
inter-cavity portal through L152 movement (Fig. 5). As shown in Fig. 3c, d, 
the opening of the L152 gate would expose the transport site through a 
nanotube to the aqueous bulk of the intracellular cavity”. When a Zn(II) 
from the intracellular cavity reaches the transport site, the favourable match 
of its coordination chemistry with the tetrahedral transport site’ would 
release binding free energy in the confinement of the hydrophobic core 
where the free energy may be guided to trigger TM5 re-orientation. By 
analogy to the working of a combustion engine, the released zinc-binding 
energy is transformed to useful mechanical energy, providing the power 
stroke of TM5 re-orientation to close the L152 gate (Fig. 5). As a result, this 
conformational change alternately exposes the transport site to the intracellular 
and extracellular cavities. The in vivo transmembrane proton gradient 
of the enteric bacterium E. coli is about one to two pH units™. The flipping of 
H153 as a part of the transport site to either side of the membrane with 
a physiological pH gradient is expected to change its protonation state. 
A deprotonated H153 facing a relatively alkaline cytosol would promote 
Zn(II) binding from the intracellular cavity whereas a protonated H153 
facing a relatively acidic periplasm may facilitate Zn(II) release into the 
extracellular cavity (Fig. 5). As such, an inward pH gradient drives a 
vectorial Zn(II) efflux in a 1:1 exchange stoichiometry. The dynamic details 
revealed in the present study explain how a physiological proton gradient, 
zinc coordination chemistry and water nanofluidics are orchestrated in a 
dynamic protein structure to overcome the activation barrier to Zn(II) 
efflux and promote a vectorial Zn(II) movement through an inter-cavity 
water portal that is highly conserved in the CDF protein family. 
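
The expected change in the protonation state of H153 across the membrane can be illustrated with the Henderson-Hasselbalch relation. The sketch below is a rough estimate under loud assumptions: the side-chain pKa of 6.5 and the two pH values are hypothetical, chosen only so that the gradient falls within the one to two pH units mentioned above; the actual pKa of H153 in YiiP is not given in the text.

```python
def fraction_protonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a titratable group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


ASSUMED_PKA = 6.5      # hypothetical histidine side-chain pKa
CYTOSOL_PH = 7.5       # hypothetical, relatively alkaline side
PERIPLASM_PH = 6.0     # hypothetical, relatively acidic side (1.5 pH units lower)

inside = fraction_protonated(CYTOSOL_PH, ASSUMED_PKA)
outside = fraction_protonated(PERIPLASM_PH, ASSUMED_PKA)

print(f"protonated fraction facing cytosol:   {inside:.2f}")
print(f"protonated fraction facing periplasm: {outside:.2f}")
```

Under these assumed values the histidine is mostly deprotonated when facing the cytosol and mostly protonated when facing the periplasm, which is the qualitative behaviour the paragraph relies on.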


METHODS SUMMARY 


YiiP was overexpressed and purified as described previously*. Prior to X-ray irra- 
diation, YiiP was de-metallized and then exchanged to a radiolytic labelling buffer 
(10 mM NaPi, pH 6.5, 100 mM NaCl, 0.02% DDM, 0.1 mM TCEP) by size-exclusion 
HPLC. ZnCl2 was added to 0.1 mM to an aliquot of apo-YiiP to form zinc-YiiP. Apo- 
YiiP or zinc-YiiP at a concentration of 10 µM was exposed to an X-ray white beam 
at 4°C at beamline X28C of the National Synchrotron Light Source, Brookhaven 
National Laboratory, as described previously'*”*. Proteolytic cleavage of the irradiated 
samples used pepsin, trypsin or trypsin-chymotrypsin double digestion. Bottom-up 
proteomic analysis was performed by reverse-phase liquid chromatography interfaced to a Fourier 
transform mass spectrometer. The MS/MS data for peptides and their sites of mod- 
ifications were manually interpreted with the aid of proteomics software”. The peak 
area from the extracted ion chromatograms of a specific peptide fragment was used to 
quantify the amount of modification. The extent of modification versus the X-ray 
irradiation time was fitted to a single exponential function to determine the hydroxyl 
radical reactivity rate of the side chain. Time-resolved radiolysis was performed with a 
modified KinTek apparatus using a standard flow sequence’*”’. Apo-YiiP at a con- 
centration of 20 µM, and an equal volume of 0.2 mM ZnCl2 were mixed by a T-mixer. 
After a designated delay time, the mixed sample was driven through an irradiation cell 
and then into a quenching tube. The extent of radiolytic modification was plotted 
against the reaction time and fitted with a single exponential function to determine the 
rate of water accessibility change. All data are presented as mean ± s.e.m. based on 
three or more independent measurements. 
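
As a quick consistency check on the mixing step, the sketch below computes the concentrations after combining equal volumes in the T-mixer. The stock concentrations are those stated above; the function name and the unit volumes are illustrative assumptions.

```python
def mixed_concentration(c_stock, v_stock, v_total):
    """Concentration after diluting a stock of volume v_stock into v_total."""
    return c_stock * v_stock / v_total


# Equal volumes (arbitrary units) of the two stocks
v_protein = 1.0
v_zinc = 1.0
v_total = v_protein + v_zinc

protein_uM = mixed_concentration(20.0, v_protein, v_total)  # apo-YiiP stock, 20 µM
zinc_mM = mixed_concentration(0.2, v_zinc, v_total)         # ZnCl2 stock, 0.2 mM

print(f"YiiP after mixing:  {protein_uM:.0f} µM")
print(f"ZnCl2 after mixing: {zinc_mM:.1f} mM")
print(f"zinc:protein molar ratio = {zinc_mM * 1000.0 / protein_uM:.0f}")
```

The post-mixing values, 10 µM protein and 0.1 mM ZnCl2, equal the concentrations used for the steady-state zinc-YiiP samples, so the kinetic and equilibrium measurements are directly comparable.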


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
MATERIALS RESEARCH 


Batteries warm up 


Interest in energy-storage research is growing, opening up 
opportunities for chemists with interdisciplinary skills. 


BY KATHARINE GAMMON 


For materials scientist Lynn Trahey, the 
impetus for a change in career direction 
was a single sentence she heard in 2005. 
Then a doctoral student in chemistry at the University 
of California, Berkeley, she was attending 
a talk about the potential of powering the world 
with solar energy. The speaker made an offhand 
comment, that energy storage is part of the 
equation for the future of green energy, and 
Trahey’s interest was piqued. “I took that and 
ran with it: he’s right. We will need to store 
energy from all kinds of sources. Batteries have 
to be part of the innovation,” she says. 
Trahey completed her studies in 2007 and 
went hunting for a postdoctoral position 
that would satisfy her interest in lithium-ion 
batteries, the most common rechargeable bat- 
tery used in electronics and cars. She landed a 
joint postdoc in 2008 at two institutes in Illinois 
— Argonne National Laboratory near Chicago 
and Northwestern University in Evanston — 
then nabbed a permanent position as an assis- 
tant materials scientist at the lab two years later. 

Today, she is researching materials and reac- 
tions in high-energy-density rechargeable cells 
with the aim of making longer-lasting batteries. 
Specifically, she wants to learn why the devices 
degrade where their liquid and solid compo- 
nents connect. She is also working to develop 
anodes — where positive current flows into the 
battery — from materials such as tin and silicon, 
which are cheaper and last longer than lithium. 

The world is gradually moving to greener 
sources of energy, but trapping that power is 
troublesome because energy is lost every time 
it is moved or converted. Governments and 
industry are investing heavily to improve energy 
storage, and this is translating into research 
opportunities for early-career scientists who 
have skills in chemistry, electrochemistry and 
materials science, and are familiar enough with 
physics and engineering to discuss energy stor- 
age with physicists and engineers. 

Trahey recommends that doctoral students 
who are interested in the field get a multi- 
disciplinary education, particularly in the fields 
of electrochemistry and chemical engineering. 
In her battery group, she is surrounded by mate- 
rials engineers and physicists who communi- 
cate well with one another. “We speak different 
languages on the same topic,” she says, but she 
learned on the job how to make sure she under- 
stands — and is understood by — those in other 
specialities. She advises early-career researchers 
to hone these skills by attending talks on energy 
storage that lie outside their specific field. 


BOUNDARY BREAKERS 

A collaborative mindset helps to boost employ- 
ability because the battery field requires a 
particularly cooperative spirit owing to its 
complexity. Devin Hodge oversees hiring for 
the Joint Center for Energy Storage Research 
(JCESR), a five-year, US$120-million part- 
nership between government, industry and aca- 
demia that is funded by the US Department of 
Energy (DOE) and located at Argonne. He says 
that scientists who want to work effectively in an 
energy-storage laboratory should have a spirit of 
innovation as well as collaboration. 

Hodge says that research opportunities will 
be opening up at the JCESR in the next year or 
so, but not all is rosy in the field’s hiring out- 
look. Funding, at least in US academia, could 
well tighten; although federal spending on bat- 
tery research has risen in the past five years, 
researchers hoping to grab a slice of that pie 
have glutted the field to some extent, says Brent 
Melot, a chemist at the University of Southern 
California in Los Angeles. “Some people say that 
it’s the worst funding environment because of 
the number of people who are now competing 
for the same opportunities,” he says. “All the 
people who used to research magnets now work 
on energy storage.” 

But industry, including the car and electronic- 
device sectors, is not stymied by the same fund- 
ing crunch as academia. And the field of energy 
storage is growing around the world — meaning 
more jobs. Globally, government-funded battery 
and fuel-cell research and development 
accounted for $8.7 billion from 2008 to 
2012, and that number is expected to increase. 
The French government has allotted €140 mil- 
lion (US$189 million) to automotive-battery 
research. And energy-storage associations have also 
cropped up in Asia: the Korea Battery Industry 
Association started in 2011; India’s version 
launched in 2012; and China’s Energy Storage 
Alliance started in 2010. Each of these associations 
supports research efforts and helps to 
move information to partners who can use it. 
The JCESR, launched by the DOE in 
2012, now has 14 national partners, includ- 
ing Johnson Controls in Glendale, Wisconsin, 
and the Dow Chemical Company in Midland, 
Michigan. It aims to develop batteries that 
generate power in different ways, partnering 
lithium with oxygen or sulphur, for example 
(see Nature 507, 26-28; 2014). That means 
opportunities for early-career scientists in 
chemistry, electrochemistry, materials sci- 
ence, nanotechnology, chemical engineering, 
computation and mechanical engineering. 
In the next wave of hiring, some of the 
current postdocs will move into staff- 
scientist positions internally. Hodge says that 
graduate students, and postdocs in particular, 
will learn skills such as computational and 
materials techniques, which will effectively 
position them for high-paying permanent 
positions at research centres or in industry. 


STORAGE STARS 

A constellation of battery-related start-up 
businesses has emerged across the United 
States, and prospects abound for young, tal- 
ented researchers who can innovate using 
chemistry. One of those is Vincent Giordani, 
a senior scientist at Liox Power, an energy 
research-and-development company in Pasa- 
dena, California. 

Giordani acknowledges that his career 
path was serendipitous. When he presented 
his doctoral thesis on rechargeable non-aqueous
lithium-air batteries at a conference in
Canada, two executives from Liox happened
to be in the audience. He was hired soon after, 
and moved to the United States two weeks 
after finishing his PhD jointly in France and 
the United Kingdom in 2010. 

Industry presents interesting challenges 
for scientists in battery research, he says. “In 
academia, you have to bring money and write 
papers, but in industry there are higher stakes 
in coming up with a new technology.” He is
working on batteries that are more recyclable, 
hold more energy and can push cars farther. 

Indeed, the automotive industry is focusing 
on creating cheaper and longer-lasting batter- 
ies, which should produce opportunities for 
early-career researchers. “The auto industry 
will change more in the next 10 years than 
it has in the past 100 years — a battery is no 
longer a commodity, but an integral part of 
the vehicle itself, with increasing demands 
for power and longevity,” says David Cue, a
vice-president at Johnson Controls Power 
Solutions in Milwaukee, Wisconsin, which is 
hiring PhD graduates with a background in 
engineering and materials. 

To succeed in battery research, it helps to 
have a broad view of the necessary compo- 
nents. The shift from characterizing materials 
to working in a battery system can be daunt- 
ing. “It’s especially challenging for chemists 
because energy storage is an engineering-focused
world,” says Melot. “If you’re not able
to focus on the whole device, you go down 
dead-end roads.” He recommends that early- 
career researchers get immersed in the wider 
field by gaining experience in labs that look at 
a whole device instead of one tiny part. 

Despite academia’s uncertain outlook, the
interest in, and funding for, a world of
better batteries continues to rise. In late
June, the DOE announced a $3.2-million
investment in the National Incubator
Initiative for Clean
Energy, which will create a network to assist
small businesses focused on clean energy in 
honing their ideas and bringing green prod- 
ucts to market. The incubators should help 
small businesses to grow, creating more jobs 
for early-career researchers. 

The International Energy Agency estimates 
that by 2018 one-quarter of the electricity pro- 
duced worldwide will come from renewable 
sources. Wherever clean energy is, there will 
always be a need to store it in a better way. 
“There are a lot of opportunities,” says Cue,
who has worked in Germany, China and
France as well as in the United States. “And
there aren’t enough electrochemists to fill the
demand.”




Katharine Gammon is a freelance writer in 
Santa Monica, California. 
RESEARCH AND DEVELOPMENT
Outsourcing trends 


US drug-makers are outsourcing
more and more of their research and
development, mainly to contract research 
organizations, according to figures 
released in July by the US National 
Science Foundation (NSF). The shift 
could be good news for researchers 
seeking positions in industry. In 1991, 
pharmaceutical companies spent
about US$800 million on external
research and development, but that 
skyrocketed to $13 billion in 2011, says 
John Jankowski, head of research and 
development statistics at the NSF. That
growth outstrips that of any other sector. 
In 1991, industrial extramural research 
spending totalled $3.3 billion, but by 
2011, spending had risen to $25.3 billion 
for domestic companies alone. 
Pharmaceutical firms’ share of that total 
was 23% in 1991, but ballooned to 51% 
by 2011. Jankowski says that much of the 
increase comes from the outsourcing of 
clinical trials. The number of US contract 
research organizations has risen to match 
the demand, from around 800 in 2000
to more than 3,100 by the end of 2011,
according to the Tufts Center for the 
Study of Drug Development in Boston, 
Massachusetts. 


EMPLOYMENT LAW 
Graduate rights 


The American Association of University 
Professors in Washington DC has
filed a legal document with the National
Labor Relations Board arguing that 
graduate assistants, including research 
technicians, at private institutions 
should be considered employees and 
should therefore have collective- 
bargaining rights. The brief argues that 
the board should revise its definition
of employee status, which is based on a
2004 decision that graduate assistants at 
Brown University in Providence, Rhode 
Island, were not employees because 
their work was inextricably linked to 
their study. Union representation of 
graduate assistants is a contentious 
issue. In 2012, Michigan banned 
graduate-student research assistants
in public universities from unionizing,
arguing that giving students employee 
status would alter the student-teacher 
relationship. In 2008, research assistants 
at the Research Foundation of the State 
University of New York voted to elect 
union representation after a 2007 board 
ruling that they were fundamentally 
employees. 


SCIENCE FICTION


YOUR APPLICATION FOR ETERNAL 
LIFE HAS BEEN PARTIALLY APPROVED 


BY JAMES WESLEY ROGERS 


This is an important message from the
Existence Committee. Press
1 to receive this message in English.
Press 2 to receive this message in Spanish. 
Press 3 to have this message directly injected 
into your brain using Chomsky universal 
grammar. You have three seconds to choose. 

Congratulations, Mr Lawson. We have con- 
firmed that you have less than 6 weeks to live. 
You are now eligible for total fermionic regen- 
eration. Your application has been approved 
with a continuity coefficient of 0.8 (80%). 

You will be regenerated via a process origi- 
nally developed for quantum teleportation. 
First, you will be subatomically scanned, 
using what is referred to as a ‘destructive
read’. Essentially, your body will be completely
destroyed as it is converted into pure
information. Next, we will apply powerful 
error-correcting algorithms to your data, 
and reconstitute your body with all patho- 
logical conditions eliminated. Your data will 
not be shared with any third party. 

During reconstitution, your personality 
will be slightly altered, using a simple linear 
interpolation between yourself (80%) and a 
standard personality template (20%). 

Your continuity coefficient of 0.8 (80%) 
was determined by a number of factors: 

1) Your current profession, astronaut, is 
considered nonessential. By adjusting your 
aptitudes, we hope to nudge you towards a 
more useful vocation, such as obesity coun- 
selling or motivational performance art. 

2) You have a poor record of separating 
your recyclables. To have any chance at all of 
being an effective obesity counsellor, you will 
need a greater sense of civic responsibility. 

3) We have detected a disturbing pattern 
of negativity towards Canadian pop stars on 
your social media accounts. Such cynicism 
and poor aesthetic judgement have no place 
in an enlightened society. 

As your continuity coefficient is less than 
the legal threshold of 94% established by 
Jones-B v. California, your derived self will 
not legally be Linus Lawson, but rather 

your direct descend- 


Lawson-B without probate or inheritance tax. 
Your academic and professional qualifications 
will also be transferred, 
contingent on a series of 
review examinations. 

Linus Lawson-B’s age 
will be legally established 
as 21 at the time of regen- 
eration. Any contractual 
agreement, including 
marriage, to which Linus 
Lawson is a party may be 
voided at this time. 

Press 1 to review fre- 
quently asked questions 
about fermionic regen- 
eration. Press 2 to proceed 
directly to the next section. 
You have three seconds to 
choose. 


Frequently asked questions 
Will the reconstituted 
person really be me? 
For a continuity coefficient 
of 1.0 (100%), the general consensus is yes, 
both legally and ontologically. For lower 
continuity coefficients, there is much debate. 
Ask yourself this: am I the same person 
when I wake up in the morning as I was the 
night before? Research has shown that during 
periods of intense synaptic reorganization, 
the continuity coefficient of someone wak- 
ing from a deep sleep can be as low as 0.996 
(99.6%). As neural adaptation and memory
degradation accumulate every day, are you 
still the same person you were one year ago? 
It has been postulated that a human being 
is not one continuous person throughout his 
or her life, but rather a sequence of discrete 
persons, each with merely a perception of 
continuity with the preceding individuals. 
Perhaps it is this perception of continuity 
that is the functional definition of a soul. If
that is the case, and your derived self has this 
perception of continuity with you, then he or 
she will be you in the only meaningful sense. 
The most we can say, then, is that there is a
very high probability that your derived self 
will have a perception of continuity with you. 


What about my loved ones? 


All memories of your friends and fam- 
ily members will be transferred, and your 
derivative self may retain some of your emotional
attachment to them.


Will I ever need to be 
regenerated again? 
Barring accident, no. 
Genetic and cellular 
improvements made during 
error correction will result 
in a derivative body that can
be indefinitely maintained 
in optimal condition by 
conventional therapies. 


Will you keep a back-up 
of my data? 
Unfortunately, this is a 
practical impossibility. The 
amount of information 
needed to encode a human 
being at the subatomic 
level is unimaginably huge. 
At this time, we possess 
only enough information- 
storage capacity to tempo- 
rarily encode a single human being, and to 
keep one standard template. The standard 
personality template was chosen from among 
the giants of human intellect and accomplish- 
ment of the past seven decades. 


I’ve heard that the standard personality
template is legendary Canadian pop star 
Justin Bieber. Is that true? 

We can neither confirm nor deny this rumour. 
End of FAQ 


Once again, Mr Lawson, congratulations. You 
are about to embark on a great adventure. 
Due to scheduling constraints, your regen- 
eration must begin immediately. Should you 
choose to decline regeneration at this time, 
you will not be allowed to reapply. Press 1 to 
be completely regenerated into a functionally 
immortal form that will retain 80% of your 
current identity. Press 2 to decline regenera- 
tion and live for up to 6 more weeks as Linus 
Lawson. You have three seconds to choose.


James Wesley Rogers sometimes allows 
computational geometry to distract him 
from science fiction. These distractions 
occur most often at a 3D modelling software 
company in Columbus, Ohio. 


JACEY 


